Description

Over 600 different carotenoids have been described from carotenogenic organisms found among bacteria, yeast, fungi and plants. Currently only two of them, β-carotene and astaxanthin, are commercially produced in microorganisms and used in the food and feed industry. β-Carotene is obtained from algae and astaxanthin is produced in Pfaffia strains which have been generated by classical mutation. However, fermentation in Pfaffia has the disadvantage of long fermentation cycles, and recovery from algae is cumbersome. Therefore it is desirable to develop production systems which have better industrial applicability, e.g. can be manipulated for increased titers and/or reduced fermentation times. Two such systems using the biosynthetic genes from Erwinia herbicola and Erwinia uredovora have already been described in WO 91/13078 and EP 393 690, respectively. Furthermore, three β-carotene ketolase genes (β-carotene β-4-oxygenase) of the marine bacteria Agrobacterium aurantiacum and Alcaligenes strain PC-1 (crtW) [Misawa, 1995, Biochem. Biophys. Res. Com. 209, 867-876][Misawa, 1995, J. Bacteriology 177, 6575-6584] and from the green alga Haematococcus pluvialis (bkt) [Lotan, 1995, FEBS Letters 364, 125-128][Kajiwara, 1995, Plant Mol. Biol. 29, 343-352] have been cloned. E. coli carrying either the carotenogenic genes (crtE, crtB, crtY and crtI) of E. herbicola [Hundle, 1994, MGG 245, 406-416] or of E. uredovora and complemented with the crtW gene of A. aurantiacum [Misawa, 1995] or the bkt gene of H. pluvialis [Lotan, 1995][Kajiwara, 1995] resulted in the accumulation of canthaxanthin (β,β-carotene-4,4'-dione), originating from the conversion of β-carotene via the intermediate echinenone (β,β-carotene-4-one). Introduction of the above mentioned genes (crtW or bkt) into E. coli cells harbouring, besides the carotenoid biosynthesis genes mentioned above, also the crtZ gene of E. uredovora [Kajiwara, 1995][Misawa, 1995] resulted in both cases in the accumulation of astaxanthin (3,3'-dihydroxy-β,β-carotene-4,4'-dione). The results obtained with the bkt gene are in contrast to the observation made by others [Lotan, 1995], who, using the same experimental set-up but introducing the H. pluvialis bkt gene in a zeaxanthin (β,β-carotene-3,3'-diol) synthesising E. coli host harbouring the carotenoid biosynthesis genes of E. herbicola, a close relative of the above mentioned E. uredovora strain, did not observe astaxanthin production.

Since there is a continuing need for even more optimized fermentation systems for industrial application, it is in the first instance an object of the present invention to provide a process for the preparation of canthaxanthin by culturing under suitable culture conditions a cell which is transformed by a DNA sequence comprising the following DNA sequences:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA sequence which is substantially homologous;

e) a DNA sequence which encodes the β-carotene β4-oxygenase of the microorganism E-396 (FERM BP-4283) [crtWE396] or a DNA sequence which is substantially homologous;

or a cell which is transformed by a vector comprising DNA sequences specified above under a) to e), and by isolating canthaxanthin from such cells or the culture medium by methods known in the art.

Furthermore it is in the second instance an object of the present invention to provide a process for the preparation of a mixture of adonixanthin and astaxanthin, or of adonixanthin or astaxanthin alone, by a process as mentioned above characterized in that in addition to the DNA sequences specified under a) to e) the following additional DNA sequence is present:

f) a DNA sequence which encodes the β-carotene hydroxylase of the microorganism E-396 (FERM BP-4283) [crtZE396] or a DNA sequence which is substantially homologous;

and the DNA sequence specified under e) is as specified above or the following sequence:

g) a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is substantially homologous;

and isolating the desired mixture of adonixanthin and astaxanthin, or adonixanthin or astaxanthin alone, from such cells or the culture medium and separating the desired mixture or carotenoids alone from other carotenoids which might be present by methods known in the art.

Furthermore it is an object of the present invention to provide a process for the preparation of zeaxanthin by a process as claimed in the first instance characterized in that the DNA sequence as specified under e) is replaced by the DNA sequence as specified under f) in the second instance, and by isolating zeaxanthin from the cell or the culture medium and separating it from other carotenoids which might be present by methods known in the art.

Furthermore it is an object of the present invention to provide a process for the production of adonixanthin by culturing under suitable culture conditions a cell which is transformed by a DNA sequence comprising the following heterologous DNA sequences:

a) a DNA sequence which encodes the GGPP synthase of the microorganism E-396 (FERM BP-4283) [crtEE396] or a DNA sequence which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of the microorganism E-396 (FERM BP-4283) [crtBE396] or a DNA sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of the microorganism E-396 (FERM BP-4283) [crtIE396] or a DNA sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of the microorganism E-396 (FERM BP-4283) [crtYE396] or a DNA sequence which is substantially homologous;

e) a DNA sequence which encodes the β-carotene hydroxylase of the microorganism E-396 (FERM BP-4283) [crtZE396] or a DNA sequence which is substantially homologous; and

f) a DNA sequence which encodes the β-carotene β4-oxygenase of the microorganism E-396 (FERM BP-4283) [crtWE396] or a DNA sequence which is substantially homologous;

and isolating adonixanthin from the cell or the culture medium and separating it from other carotenoids which might be present by methods known in the art.

Further it is an object of the present invention to provide a process as described above characterized in that the transformed host cell is a prokaryotic host cell, like E. coli, Bacillus or Flavobacter, and a process as described above characterized in that the transformed host cell is a eukaryotic host cell, like yeast or a fungal cell.

Furthermore it is an object of the present invention to provide a DNA sequence comprising one or more DNA sequences selected from the group consisting of:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA sequence which is substantially homologous; and

e) a DNA sequence which encodes the β-carotene hydroxylase of Flavobacterium sp. R1534 (crtZ) or a DNA sequence which is substantially homologous.

It is also an object of the present invention to provide a vector comprising such DNA sequence, preferably in the form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the present invention. Finally the present invention also concerns a process for the preparation of a desired carotenoid or a mixture of carotenoids by culturing such a cell under suitable culture conditions and isolating the desired carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by methods known in the art from other carotenoids which might also be present, and a process for the preparation of a food or feed composition characterized in that after such a process has been effected the carotenoid or carotenoid mixture is added to food or feed.

Furthermore, a DNA sequence comprising the following DNA sequences is an object of the present invention:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA sequence which is substantially homologous; and

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA sequence which is substantially homologous.

It is also an object of the present invention to provide a vector comprising such DNA sequence, preferably in the form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the present invention. Finally the present invention also concerns a process for the preparation of a desired carotenoid or a mixture of carotenoids by culturing such a cell under suitable conditions and isolating the desired carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by methods known in the art from other carotenoids which might also be present, preferably such a process for the preparation of lycopene, and a process for the preparation of a food or feed composition characterized in that after such a process has been effected the carotenoid, preferably lycopene, or carotenoid mixture, preferably a lycopene comprising mixture, is added to food or feed.

Furthermore a DNA sequence comprising the following DNA sequences is also an object of the present invention:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA sequence which is substantially homologous; and

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA sequence which is substantially homologous.

It is also an object of the present invention to provide a vector comprising such DNA sequence, preferably in the form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the present invention. Finally the present invention also concerns a process for the preparation of a desired carotenoid or a mixture of carotenoids by culturing such a cell under suitable conditions and isolating the desired carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by methods known in the art from other carotenoids which might also be present, preferably such a process for the preparation of β-carotene, and a process for the preparation of a food or feed composition characterized in that after such a process has been effected the carotenoid, preferably β-carotene, or carotenoid mixture, preferably a β-carotene comprising mixture, is added to food or feed.

Furthermore a cell which is transformed by the above mentioned DNA sequence comprising subsequences a) to d), or the vector comprising it, and by a second DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is substantially homologous, or by a second vector which comprises a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is substantially homologous, is also an object of the present invention, as is a process for the preparation of a desired carotenoid or a mixture of carotenoids by culturing such cell under suitable conditions and isolating the desired carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by methods known in the art from other carotenoids which might also be present, preferably such a process for the preparation of echinenone, and a process for the preparation of a food or feed composition characterized in that after such a process has been effected the carotenoid, preferably echinenone, or carotenoid mixture, preferably an echinenone comprising mixture, is added to food or feed.

Furthermore it is an object of the present invention to provide a DNA sequence as mentioned above comprising subsequences a) to d) and a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is substantially homologous, and a vector comprising such DNA sequence, preferably in the form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the present invention. Finally the present invention also concerns a process for the preparation of a desired carotenoid or a mixture of carotenoids by culturing such a cell under suitable culture conditions and isolating the desired carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by methods known in the art from other carotenoids which might also be present, especially such a process for the preparation of echinenone or canthaxanthin, and a process for the preparation of a food or feed composition characterized in that after such a process has been effected the carotenoid, preferably echinenone or canthaxanthin, or carotenoid mixture, preferably an echinenone or canthaxanthin containing mixture, is added to food or feed.

Furthermore a DNA sequence comprising the following DNA sequences is also an object of the present invention:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA sequence which is substantially homologous; and

e) a DNA sequence which encodes the β-carotene hydroxylase of Flavobacterium sp. R1534 (crtZ) or a DNA sequence which is substantially homologous.

It is also an object of the present invention to provide a vector comprising such DNA sequence, preferably in the form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the present invention. Finally the present invention also concerns a process for the preparation of a desired carotenoid or a mixture of carotenoids by culturing such a cell under suitable conditions and isolating the desired carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by methods known in the art from other carotenoids which might also be present, preferably such a process for the preparation of zeaxanthin, and a process for the preparation of a food or feed composition characterized in that after such a process has been effected the carotenoid, preferably zeaxanthin, or the carotenoid mixture, preferably a zeaxanthin containing mixture, is added to food or feed.

Furthermore a DNA sequence as mentioned above comprising subsequences a) to e) and in addition a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is substantially homologous is an object of the present invention, as is a vector comprising such DNA sequence, preferably in the form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the present invention. Finally the present invention also concerns a process for the preparation of a desired carotenoid or a mixture of carotenoids by culturing such a cell under suitable culture conditions and isolating the desired carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by methods known in the art from other carotenoids which might also be present, preferably such a process for the preparation of zeaxanthin, adonixanthin or astaxanthin, and a process for the preparation of a food or feed composition characterized in that after such a process has been effected the carotenoid, preferably zeaxanthin, adonixanthin or astaxanthin, or carotenoid mixture, preferably a zeaxanthin, adonixanthin or astaxanthin containing mixture, is added to food or feed.

Furthermore a cell which is transformed by the DNA sequence mentioned above comprising subsequences a) to e), or a vector comprising such DNA sequence, and by a second DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is substantially homologous, or by a second vector which comprises a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is substantially homologous, is also an object of the present invention, as is a process for the preparation of a desired carotenoid or a mixture of carotenoids by culturing such a cell under suitable conditions and isolating the desired carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by methods known in the art from other carotenoids which might also be present, preferably such a process for the preparation of zeaxanthin or adonixanthin, and a process for the preparation of a food or feed composition characterized in that after such a process has been effected the carotenoid, preferably zeaxanthin or adonixanthin, or carotenoid mixture, preferably a zeaxanthin or adonixanthin containing mixture, is added to food or feed.

Furthermore it is an object of the present invention to provide the DNA sequences and vectors as specified before and a process for the preparation of a food or feed composition characterized in that after a process as specified before has been effected the carotenoid prepared by such process is added to food or feed.
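
The gene combinations and the preferred products named for them in the objects above can be summarised as a simple lookup. The following Python sketch is illustrative only: the dictionary `PREFERRED_PRODUCTS` and the function `preferred_products` are hypothetical names introduced here, the entries merely restate which carotenoids the text names as preferred for each combination of Flavobacterium sp. R1534 genes (with crtW standing for the Alcaligenes strain PC-1 gene), and nothing is implied about yields or about any particular host.

```python
from typing import Iterable, Tuple

# Preferred products per gene combination, restated from the text.
# This is a summary table, not a metabolic model.
PREFERRED_PRODUCTS = {
    frozenset({"crtE", "crtB", "crtI"}): ("lycopene",),
    frozenset({"crtE", "crtB", "crtI", "crtY"}): ("beta-carotene",),
    frozenset({"crtE", "crtB", "crtI", "crtY", "crtZ"}): ("zeaxanthin",),
    frozenset({"crtE", "crtB", "crtI", "crtY", "crtW"}): (
        "echinenone",
        "canthaxanthin",
    ),
    frozenset({"crtE", "crtB", "crtI", "crtY", "crtZ", "crtW"}): (
        "zeaxanthin",
        "adonixanthin",
        "astaxanthin",
    ),
}


def preferred_products(genes: Iterable[str]) -> Tuple[str, ...]:
    """Return the carotenoids the text names as preferred for this gene set.

    An empty tuple means the combination is not one of those listed above.
    """
    return PREFERRED_PRODUCTS.get(frozenset(genes), ())
```

For example, `preferred_products(["crtE", "crtB", "crtI"])` looks up the three-gene combination for which the text names lycopene as the preferred product.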

In this context it should be mentioned that the expression "a DNA sequence is substantially homologous" refers with respect to the crtE encoding DNA sequence to a DNA sequence which encodes an amino acid sequence which shows more than 45 %, preferably more than 60 %, more preferably more than 75 % and most preferably more than 90 % identical amino acids when compared to the amino acid sequence of crtE of Flavobacterium sp. R1534 and is the amino acid sequence of a polypeptide which shows the same type of enzymatic activity as the enzyme encoded by crtE of Flavobacterium sp. R1534. In analogy, with respect to crtB this means more than 60 %, preferably more than 70 %, more preferably more than 80 % and most preferably more than 90 %; with respect to crtI this means more than 70 %, preferably more than 80 % and most preferably more than 90 %; with respect to crtY this means 55 %, preferably 70 %, more preferably 80 % and most preferably 90 %.

"DNA sequences which are substantially homologous" refer with respect to the crtWE396 encoding DNA sequence to a DNA sequence which encodes an amino acid sequence which shows more than 60 %, preferably more than 75 % and most preferably more than 90 % identical amino acids when compared to the amino acid sequence of crtW of the microorganism E-396 (FERM BP-4283) and is the amino acid sequence of a polypeptide which shows the same type of enzymatic activity as the enzyme encoded by crtW of the microorganism E-396. In analogy, with respect to crtZE396 this means more than 75 %, preferably more than 80 % and most preferably more than 90 %; with respect to crtEE396, crtBE396, crtIE396, crtYE396 and crtZE396 this means more than 80 %, preferably more than 90 % and most preferably 95 %.
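
The identity part of this definition can be illustrated with a short calculation. The Python sketch below is a minimal example under stated assumptions: the names `BROADEST_TIER`, `percent_identity` and `passes_identity_criterion` are hypothetical, the two sequences are assumed to be already aligned (equal length, `-` for gaps), identity is normalised by the number of residues in the reference sequence (the text does not prescribe an alignment or counting method), and only the broadest tier for the Flavobacterium sp. R1534 genes is encoded.

```python
# Broadest identity tier per Flavobacterium sp. R1534 gene, restated from
# the definition above. For crtY the text gives "55 %" without "more than";
# treating it as a strict lower bound like the others is an assumption.
BROADEST_TIER = {"crtE": 45.0, "crtB": 60.0, "crtI": 70.0, "crtY": 55.0}


def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent of reference residues that are identical in the query.

    Both strings must come from the same pairwise alignment, so they have
    equal length and use '-' for gaps. Normalising by the number of
    reference residues is an assumption made for this sketch.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_residues = sum(1 for r in aligned_ref if r != "-")
    if ref_residues == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r != "-" and r == q
    )
    return 100.0 * matches / ref_residues


def passes_identity_criterion(gene: str, aligned_ref: str, aligned_query: str) -> bool:
    """True if identity exceeds the broadest tier stated for `gene`.

    This checks only the identity half of the definition; the encoded
    polypeptide must also show the same type of enzymatic activity,
    which cannot be decided from sequence identity alone.
    """
    return percent_identity(aligned_ref, aligned_query) > BROADEST_TIER[gene]
```

Because the thresholds are phrased as "more than", a sequence with exactly 60 % identity to crtB would not pass in this sketch.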

DNA sequences in form of genomic DNA, cDNA or synthetic DNA can be prepared as known in the art [see e.g. Sambrook et al., Molecular Cloning, Cold Spring Harbor Laboratory Press 1989] or, e.g., as specifically described in Examples 1, 2 or 7. In the context of the present invention it should be noted that all DNA sequences used for the process for production of carotenoids of the present invention encoding crt-gene products can also be prepared as synthetic DNA sequences according to known methods or in analogy to the method specifically described for crtW in Example 7.

The cloning of the DNA sequences of the present invention from such genomic DNA can then be effected, e.g. by using the well known polymerase chain reaction (PCR) method. The principles of this method are outlined e.g. in PCR Protocols: A Guide to Methods and Applications, Academic Press, Inc. (1990). PCR is an in vitro method for producing large amounts of a specific DNA of defined length and sequence from a mixture of different DNA sequences. PCR is based on the enzymatic amplification of the specific DNA fragment of interest, which is flanked by two oligonucleotide primers which are specific for this sequence and which hybridize to opposite strands of the target sequence. The primers are oriented with their 3' ends pointing toward each other. Repeated cycles of heat denaturation of the template, annealing of the primers to their complementary sequences and extension of the annealed primers with a DNA polymerase result in the amplification of the segment between the PCR primers. Since the extension product of each primer can serve as a template for the other, each cycle essentially doubles the amount of the DNA fragment produced in the previous cycle. By utilizing the thermostable Taq DNA polymerase, isolated from the thermophilic bacterium Thermus aquaticus, it has been possible to avoid denaturation of the polymerase, which necessitated the addition of enzyme after each heat denaturation step. This development has led to the automation of PCR by a variety of simple temperature-cycling devices. In addition, the specificity of the amplification reaction is increased by allowing the use of higher temperatures for primer annealing and extension. The increased specificity improves the overall yield of amplified products by minimizing the competition by non-target fragments for enzyme and primers. In this way the specific sequence of interest is highly amplified and can be easily separated from the non-specific sequences by methods known in the art, e.g. by separation on an agarose gel, and cloned by methods known in the art using vectors as described e.g. by Holten and Graham in Nucleic Acid Res. 19, 1156 (1991), Kovalic et al. in Nucleic Acid Res. 19, 4560 (1991), Marchuk et al. in Nucleic Acid Res. 19, 1154 (1991) or Mead et al. in Bio/Technology 9, 657-663 (1991).
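
The primer geometry and the doubling per cycle described above can be illustrated with a small in silico sketch in Python. It is a simplification under stated assumptions: the function names `reverse_complement`, `amplicon` and `ideal_copies` are hypothetical, primers are required to match the template exactly, only the first matching site of each primer is used, and amplification is assumed to be perfectly efficient (exact doubling), which a real reaction only approximates.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon(template: str, forward_primer: str, reverse_primer: str) -> Optional[str]:
    """Return the segment of `template` bounded by the two primers.

    `template` is the top strand, written 5' to 3'. The forward primer is
    identical to part of the top strand (it anneals to the bottom strand);
    the reverse primer anneals to the top strand, so its reverse complement
    appears in `template` downstream of the forward primer. The 3' ends of
    both primers therefore point toward each other.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    site = template.find(reverse_site, start + len(forward_primer))
    if site == -1:
        return None
    return template[start : site + len(reverse_site)]


def ideal_copies(initial_copies: int, cycles: int) -> int:
    """Copies of the target after `cycles` cycles, assuming exact doubling."""
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    return initial_copies * 2 ** cycles
```

Under the ideal-doubling assumption, `ideal_copies(1, 30)` evaluates to 2**30, i.e. roughly 10^9 copies from a single template molecule; real yields are lower because efficiency drops in later cycles.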

The oligonucleotide primers used in the PCR procedure can be prepared as known in the art and described e.g. in Sambrook et al., s.a.

Amplified DNA sequences can then be used to screen DNA libraries by methods known in the art (Sambrook et al., s.a.) or as specifically described in Examples 1 and 2.

Once complete DNA sequences of the present invention have been obtained they can be used as a guideline to define new PCR primers for the cloning of substantially homologous DNA sequences from other sources. In addition they and such homologous DNA sequences can be integrated into vectors by methods known in the art and described e.g. in Sambrook et al. (s.a.) to express or overexpress the encoded polypeptide(s) in appropriate host systems. However, a man skilled in the art knows that the DNA sequences themselves can also be used to transform the suitable host systems of the invention to get overexpression of the encoded polypeptide. Appropriate host systems are, for example, bacteria, e.g. E. coli, Bacilli, e.g. Bacillus subtilis, or Flavobacter strains. E. coli strains which could be used are E. coli K12 strains, e.g. M15 [described as DZ 291 by Villarejo et al. in J. Bacteriol. 120, 466-474 (1974)], HB 101 [ATCC No. 33694] or E. coli SG13009 [Gottesman et al., J. Bacteriol. 148, 265-273 (1981)]. Suitable Flavobacter strains can be obtained from any of the culture collections known to the man skilled in the art and listed e.g. in the journal "Industrial Property" (January 1994, pgs 29-40), like the American Type Culture Collection (ATCC) or the Centraalbureau voor Schimmelcultures (CBS), and are, e.g., Flavobacterium sp. R1534 (ATCC No. 21588, classified as unknown bacterium; or as CBS 519.67) or all Flavobacter strains listed as CBS 517.67 to CBS 521.67 and CBS 523.67 to CBS 525.67, especially R1533 (which is CBS 523.67 or ATCC 21081, classified as unknown bacterium; see also USP 3,841,967). Further Flavobacter strains are also described in WO 91/03571. Suitable eukaryotic host systems are, for example, fungi, like Aspergilli, e.g. Aspergillus niger [ATCC 9142], or yeasts, like Saccharomyces, e.g. Saccharomyces cerevisiae, or Pichia, like Pichia pastoris, all available from ATCC.

Suitable vectors which can be used for expression in E. coli are mentioned, e.g., by Sambrook et al. [s.a.] or by Fiers et al. in "Procd. 8th Int. Biotechnology Symposium" [Soc. Franc. de Microbiol., Paris (Durand et al., eds.), pp. 680-697 (1988)] or by Bujard et al. in Methods in Enzymology, eds. Wu and Grossmann, Academic Press, Inc., Vol. 155, 416-433 (1987) and Stüber et al. in Immunological Methods, eds. Lefkovits and Pernis, Academic Press, Inc., Vol. IV, 121-152 (1990). Vectors which could be used for expression in Bacilli are known in the art and described, e.g. in EP 405 370, EP 635 572, Procd. Nat. Acad. Sci. USA 81, 439 (1984) by Yansura and Henner, Meth. Enzym. 185, 199-228 (1990), or EP 207 459. Vectors which can be used for expression in fungi are known in the art and described e.g. in EP 420 358 and for yeast in EP 183 070, EP 183 071, EP 248 227, EP 263 311. Vectors which can be used for expression in Flavobacter are known in the art and described in the Examples or, e.g., in Plasmid Technology, edt. by J. Grinsted and P.M. Bennett, Academic Press (1990).

Once such DNA sequences have been expressed in an appropriate host cell in a suitable medium, the carotenoids can be isolated either from the medium, in the case they are secreted into the medium, or from the host organism and, if necessary, separated from other carotenoids if present, in case one specific carotenoid is desired, by methods known in the art (see e.g. Carotenoids Vol. 1A: Isolation and Analysis, G. Britton, S. Liaaen-Jensen, H. Pfander; 1995, Birkhäuser Verlag, Basel).

The carotenoids of the present invention can be used in a process for the preparation of food or feeds. A man skilled in the art is familiar with such processes. Such compound foods or feeds can further comprise additives or components generally used for such purpose and known in the state of the art.

After the invention has been described in general hereinbefore, the following figures and examples are intended to illustrate details of the invention, without thereby limiting it in any manner.

45 

Figure 1 : The biosynthesis pathway for the formation or carotenoids di Flayobactehum\sp. .R 1534 is illustrated 
explaining the enzymatic activities which are encoded by DNA sequences of the present invention. 

Figure 2 : Southern blot of genomic Flavobacterium sp. R1534 DNA digested with the restriction enzymes shown 
50 • on top of each lane and hybridized with Probe 46F. The arrow indicated the isolated 2.4 kb Xhol/PstI frag- 

ment. 

Figure 3 : Southern blot of genomic Flavobacterium sp. R1 534 DNA digested with Clal or double digested with Clal 
and Hindlll. Blots shown in Panel A and B were hybridized.to probe A or probe B. respectively (see exam- 
55 pies). Both Clal/Hindlll fragments of 1 .8 kb and 9.2 kb are indicated. 

Figure 4: Southern blot of genomic Flavobacterium sp. R1534 DNA digested with the restriction enzymes shown
on top of each lane and hybridized to probe C. The isolated 2.8 kb SalI/HindIII fragment is shown by the arrow.

Figure 5: Southern blot of genomic Flavobacterium sp. R1534 DNA digested with the restriction enzymes shown
on top of each lane and hybridized to probe D. The isolated BclI/SphI fragment of approx. 3 kb is shown
by the arrow.

Figure 6: Physical map of the organization of the carotenoid biosynthesis cluster in Flavobacterium sp. R1534,
deduced from the genomic clones obtained. The locations of the probes used for the screening are shown
as bars on the respective clones.

Figure 7: Nucleotide sequence of the Flavobacterium sp. R1534 carotenoid biosynthesis cluster and its flanking
regions. The nucleotide sequence is numbered from the first nucleotide shown (see BamHI site of Fig. 6).
The deduced amino acid sequences of the ORFs (orf-5, orf-1, crtE, crtB, crtI, crtY, crtZ and orf-16) are
shown with the single-letter amino acid code. Arrows (-->) indicate the direction of the transcription;
asterisks, stop codons.

Figure 8: Protein sequence of the GGPP synthase (crtE) of Flavobacterium sp. R1534 with a MW of 31331 Da.

Figure 9: Protein sequence of the prephytoene synthetase (crtB) of Flavobacterium sp. R1534 with a MW of
32615 Da.

Figure 10: Protein sequence of the phytoene desaturase (crtI) of Flavobacterium sp. R1534 with a MW of 54411 Da.

Figure 11: Protein sequence of the lycopene cyclase (crtY) of Flavobacterium sp. R1534 with a MW of 42368 Da.

Figure 12: Protein sequence of the β-carotene hydroxylase (crtZ) of Flavobacterium sp. R1534 with a MW of 19282 Da.

Figure 13: Recombinant plasmids containing deletions of the Flavobacterium sp. R1534 carotenoid biosynthesis
gene cluster.

Figure 14: Primers used for PCR reactions. The underlined sequence is the recognition site of the indicated
restriction enzyme. Small caps indicate nucleotides introduced by mutagenesis. Boxes show the artificial RBS
which is recognized in B. subtilis. Small caps in bold show the location of the original adenine creating
the translation start site (ATG) of the following gene (see original operon). All the ATGs of the original
Flavobacterium carotenoid biosynthetic genes had to be destroyed so as not to interfere with the rebuilt
transcription start site. Arrows indicate start and ends of the indicated Flavobacterium R1534 WT carotenoid
genes.

Figure 15: Linkers used for the different constructions. The underlined sequence is the recognition site of the
indicated restriction enzyme. Small caps indicate nucleotides introduced by synthetic primers. Boxes show
the artificial RBS which is recognized in B. subtilis. Arrows indicate start and ends of the indicated
Flavobacterium carotenoid genes.

Figure 16: Construction of plasmids pBIIKS(+)-clone59-2, pLyco and pZea4.

Figure 17: Construction of plasmid p602CAR.

Figure 18: Construction of plasmids pBIIKS(+)-CARVEG-E and p602 CARVEG-E.

Figure 19: Construction of plasmids pHP13-2CARZYIB-EINV and pHP13-2PN25ZYIB-EINV.

Figure 20: Construction of plasmid pXI12-ZYIB-EINV4MUTRBS2C.

Figure 21: Northern blot analysis of B. subtilis strain BS1012::ZYIB-EINV4. Panel A: Schematic representation of a
reciprocal integration of plasmid pXI12-ZYIB-EINV4 into the levan-sucrase gene of B. subtilis. Panel B:
Northern blot obtained with probe A (PCR fragment which was obtained with CAR 51 and CAR 76 and
hybridizes to the 3' end of crtZ and the 5' end of crtY). Panel C: Northern blot obtained with probe B
(BamHI-XhoI fragment isolated from plasmid pBIIKS(+)-crtE/2 and hybridizing to the 5' part of the crtE
gene).



Figure 22: Schematic representation of the integration sites of three transformed Bacillus subtilis strains:
BS1012::SFCO, BS1012::SFCOCAT1 and BS1012::SFCONEO1. Amplification of the synthetic Flavobacterium
carotenoid operon (SFCO) can only be obtained in those strains having amplifiable structures.
Probe A was used to determine the copy number of the integrated SFCO. Erythromycin resistance
gene (ermAM), chloramphenicol resistance gene (cat), neomycin resistance gene (neo), terminator of
the cryT gene of B. subtilis (cryT), levan-sucrase gene (sac-B 5' and sac-B 3'), plasmid sequences of
pXI12 (pXI12), promoter originating from site I of the veg promoter complex (PvegI).

Figure 23: Construction of plasmids pXI12-ZYIB-EINV4MUTRBS2CNEO and pXI12-ZYIB-EINV4MUTRBS2CCAT.

Figure 24: Complete nucleotide sequence of plasmid pZea4.

Figure 25: Synthetic crtW gene of Alcaligenes PC-1. The translated protein sequence is shown above the double
stranded DNA sequence. The twelve oligonucleotides (crtW1-crtW12) used for the PCR synthesis are
underlined.

Figure 26: Construction of plasmid pBIIKS-crtEBIYZW. The HindIII-PmlI fragment of pALTER-Ex2-crtW, carrying
the synthetic crtW gene, was cloned into the HindIII and MluI (blunt) sites. PvegI and Ptac are the
promoters used for the transcription of the two operons. The ColE1 replication origin of this plasmid is
compatible with the p15A origin present in the pALTER-Ex2 constructs.

Figure 27: Relevant inserts of all plasmids constructed in Example 7. Disrupted genes are shown by //. Restriction
sites: S=SacI, X=XbaI, H=HindIII, N=NsiI, Hp=HpaI, Nd=NdeI.

Figure 28: Reaction products (carotenoids) obtained from β-carotene by the process of the present invention.



Example 1 



Materials and general methods used

Bacterial strains and plasmids: Flavobacterium sp. R1534 WT (ATCC 21588) was the DNA source for the genes
cloned. Partial genomic libraries of Flavobacterium sp. R1534 WT DNA were constructed into the pBluescriptII+(KS) or
(SK) vector (Stratagene, La Jolla, USA) and transformed into E. coli XL-1 blue (Stratagene) or JM109.

Media and growth conditions: Transformed E. coli were grown in Luria broth (LB) at 37°C with 100 µg Ampicillin
(Amp)/ml for selection. Flavobacterium sp. R1534 WT was grown at 27°C in medium containing 1% glucose, 1% tryptone
(Difco Laboratories), 1% yeast extract (Difco), 0.5% MgSO4·7H2O and 3% NaCl.

Colony screening: Screening of the E. coli transformants was done by PCR basically according to the method
described by Zon et al. [Zon et al., BioTechniques 7, 696-698 (1989)] using the following primers:

Primer #7: 5'-CGTGGATGAGGTGGTGGAATATTCG-3'

Primer #8: 5'-CAAGGGCCAGATCGCAGGGG-3'
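
Before primers such as #7 and #8 are used, their length, GC content and an approximate melting temperature are commonly checked. The following sketch is not part of the method described here; it assumes the Wallace rule (2 °C per A/T, 4 °C per G/C), a rough rule of thumb for short oligonucleotides, and all function names are illustrative.

```python
# Illustrative helpers; names are assumptions, not taken from the original text.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in s) / len(s)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    at = sum(base in "AT" for base in s)
    gc = sum(base in "GC" for base in s)
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement, e.g. to view a reverse primer on the top strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]
```

The Wallace rule is only a first approximation; nearest-neighbour models are usually preferred for primers longer than about 20 nucleotides.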

Genomic DNA: A 50 ml overnight culture of Flavobacterium sp. R1534 was centrifuged at 10,000 g for 10 minutes.
The pellet was washed briefly with 10 ml of lysis buffer (50 mM EDTA, 0.1 M NaCl, pH 7.5), resuspended in 4 ml of the
same buffer supplemented with 10 mg of lysozyme and incubated at 37°C for 15 minutes. After addition of 0.3 ml of
N-lauroyl sarcosine (20%) the incubation at 37°C was continued for another 15 minutes before the extraction of the DNA
with phenol, phenol/chloroform and chloroform. The DNA was ethanol precipitated at room temperature for 20 minutes
in the presence of 0.3 M sodium acetate (pH 5.2), followed by centrifugation at 10,000 g for 15 minutes. The pellet was
rinsed with 70% ethanol, dried and resuspended in 1 ml of TE (10 mM Tris, 1 mM EDTA, pH 8.0).

All genomic DNA used in the Southern blot analysis and cloning experiments was dialysed against H2O for 48
hours, using collodium bags (Sartorius, Germany), ethanol precipitated in the presence of 0.3 M sodium acetate and
resuspended in H2O.

Probe labelling: DNA probes were labeled with (α-32P) dGTP (Amersham) by random-priming according to
[Sambrook et al., s.a.].

Probes used to screen the mini-libraries: Probe 46F is a 119 bp fragment obtained by PCR using primer #7 and
#8 and Flavobacterium sp. R1534 genomic DNA as template. This probe was proposed to be a fragment of the
Flavobacterium sp. R1534 phytoene synthase (crtB) gene, since it shows significant homology to the phytoene synthase
genes from other species (e.g. E. uredovora, E. herbicola). Probe A is a BstXI-PstI fragment of 184 bp originating from
the right arm of the insert of clone 85. Probe B is a 397 bp XhoI-NotI fragment obtained from the left end of the insert
of clone 85. Probe C is a 536 bp BglII-PstI fragment from the right end of the insert of clone 85. Probe D is a 376 bp
KpnI-BstYI fragment isolated from the insert of clone 59. The localization of the individual probes is shown in figure 6.

Oligonucleotide synthesis: The oligonucleotides used for PCR reactions or for sequencing were synthesized with
an Applied Biosystems 392 DNA synthesizer.

Southern blot analysis: For hybridization experiments Flavobacterium sp. R1534 genomic DNA (3 µg) was
digested with the appropriate restriction enzymes and electrophoresed on a 0.75% agarose gel. The transfer to
Zeta-Probe blotting membranes (BIO-RAD) was done as described [Southern, E.M., J. Mol. Biol. 98, 503 (1975)].
Prehybridization and hybridization was in 7% SDS, 1% BSA (fraction V; Boehringer), 0.5 M Na2HPO4, pH 7.2 at 65°C. After
hybridization the membranes were washed twice for 5 minutes in 2x SSC, 1% SDS at room temperature and twice for
15 minutes in 0.1% SSC, 0.1% SDS at 65°C.

DNA sequence analysis: The sequence was determined by the dideoxy chain termination technique [Sanger et
al., Proc. Natl. Acad. Sci. USA 74, 5463-5467 (1977)] using the Sequenase Kit (United States Biochemical). Both
strands were completely sequenced and the sequence analyzed using the GCG sequence analysis software package
(Version 8.0) by Genetics Computer, Inc. [Devereux et al., Nucleic Acids Res. 12, 387-395 (1984)].

Analysis of carotenoids: E. coli XL-1 or JM109 cells (200-400 ml) carrying different plasmid constructs were
grown for the times indicated in the text, usually 24 to 60 hours, in LB supplemented with 100 µg Ampicillin/ml, in shake
flasks at 37°C and 220 rpm.

The carotenoids present in the microorganisms were extracted with an adequate volume of acetone using a
rotation homogenizer (Polytron, Kinematica AG, CH-Luzern). The homogenate was then filtered through the sintered glass
of a suction filter into a round bottom flask. The filtrate was evaporated by means of a rotation evaporator at 50°C using
a water-jet vacuum. For the zeaxanthin detection the residue was dissolved in n-hexane/acetone (86:14) before
analysis with a normal-phase HPLC as described in [Weber, S. in Analytical Methods for Vitamins and Carotenoids in Feed,
Keller, H.E., Editor, 83-85 (1988)]. For the detection of β-carotene and lycopene the evaporated extract was dissolved
in n-hexane/acetone (99:1) and analysed by HPLC as described in [Hengartner et al., Helv. Chim. Acta 75, 1848-1865
(1992)].

Example 2 

Cloning of the Flavobacterium sp. R1534 carotenoid biosynthetic genes.

To identify and isolate DNA fragments carrying the genes of the carotenoid biosynthesis pathway, we used the DNA
fragment 46F (see methods) to probe a Southern blot carrying chromosomal DNA of Flavobacterium sp. R1534
digested with different restriction enzymes (Fig. 2). The 2.4 kb XhoI/PstI fragment hybridizing to the probe seemed the
most appropriate one to start with. Genomic Flavobacterium sp. R1534 DNA was digested with XhoI/PstI and run on a
1% agarose gel. According to a comigrating DNA marker, the region of about 2.4 kb was cut out of the gel and the DNA
isolated. A XhoI/PstI mini library of Flavobacterium sp. R1534 genomic DNA was constructed into the XhoI-PstI sites of
pBluescriptIISK(+). One hundred E. coli XL1 transformants were subsequently screened by PCR with primer #7 and
primer #8, the same primers previously used to obtain the 119 bp fragment (46F). One positive transformant, named
clone 85, was found. Sequencing of the insert revealed sequences not only homologous to the phytoene synthase
(crtB) but also to the phytoene desaturase (crtI) of both Erwinia species herbicola and uredovora. Left and right hand
genomic sequences of clone 85 were obtained by the same approach using probe A and probe B. Flavobacterium sp.
R1534 genomic DNA was double digested with ClaI and HindIII and subjected to Southern analysis with probe A and
probe B. With probe A a ClaI/HindIII fragment of approx. 1.8 kb was identified (Fig. 3A), isolated and subcloned into the
ClaI/HindIII sites of pBluescriptIIKS(+). Screening of the E. coli XL1 transformants with probe A gave 6 positive clones.
The insert of one of these positives, clone 43-3, was sequenced and showed homology to the N-terminus of crtI genes
and to the C-terminus of crtY genes of both Erwinia species mentioned above. With probe B an approx. 9.2 kb
ClaI/HindIII fragment was detected (Fig. 3B), isolated and subcloned into pBluescriptIIKS(+).

A screening of the transformants gave one positive, clone 51. Sequencing of the 5' and 3' ends of the insert revealed
that only the region close to the HindIII site showed relevant homology to genes of the carotenoid biosynthesis of the
Erwinia species mentioned above (e.g. crtB gene and crtE gene). The sequence around the ClaI site showed no
homology to known genes of the carotenoid biosynthesis pathway. Based on this information and to facilitate further
sequencing and construction work, the 4.2 kb BamHI/HindIII fragment of clone 51 was subcloned into the respective sites of
pBluescriptIIKS(+) resulting in clone 2. Sequencing of the insert of this clone confirmed the presence of genes
homologous to Erwinia sp. crtB and crtE genes. These genes were located within 1.8 kb from the HindIII site. The remaining
2.4 kb of this insert had no homology to known carotenoid biosynthesis genes.

Additional genomic sequences downstream of the ClaI site were detected using probe C to hybridize to
Flavobacterium sp. R1534 genomic DNA digested with different restriction enzymes (see figure 4).

A SalI/HindIII fragment of 2.8 kb identified by Southern analysis was isolated and subcloned into the HindIII/XhoI
sites of pBluescriptIIKS(+). Screening of the E. coli XL1 transformants with probe A gave one positive clone named
clone 59. The insert of this clone confirmed the sequence of clone 43-3 and contained in addition sequences
homologous to the N-terminus of the crtY gene from other known lycopene cyclases. To obtain the putative missing crtZ gene
a Sau3AI partial digestion library of Flavobacterium sp. R1534 was constructed into the BamHI site of
pBluescriptIIKS(+). Screening of this library with probe D gave several positive clones. One transformant, designated clone 6a, had
an insert of 4.9 kb. Sequencing of the insert revealed besides the already known sequences coding for crtB, crtI and
crtY also the missing crtZ gene. Clone 7g was isolated from a mini library carrying BclI/SphI fragments of R1534 (Fig.
5) and screened with probe D. The insert size of clone 7g is approx. 3 kb.

The six independent inserts of the clones described above covering approx. 14 kb of the Flavobacterium sp. R1534
genome are compiled in Figure 6.

The determined sequence spanning from the BamHI site (position 1) to base pair 8625 is shown in figure 7.

Putative protein coding regions of the cloned R1534 sequence. 

Computer analysis using the Codon Preference program of the GCG package, which recognizes protein coding
regions by virtue of the similarity of their codon usage to a given codon frequency table, revealed eight open reading
frames (ORFs) encoding putative proteins: a partial ORF from 1 to 1165 (ORF-5) coding for a polypeptide larger than
41382 Da; an ORF coding for a polypeptide with a molecular weight of 40081 Da from 1180 to 2352 (ORF-1); an ORF
coding for a polypeptide with a molecular weight of 31331 Da from 2521 to 3405 (crtE); an ORF coding for a polypeptide
with a molecular weight of 32615 Da from 4316 to 3408 (crtB); an ORF coding for a polypeptide with a molecular weight
of 54411 Da from 5797 to 4316 (crtI); an ORF coding for a polypeptide with a molecular weight of 42368 Da from 6942
to 5797 (crtY); an ORF coding for a polypeptide with a molecular weight of 19282 Da from 7448 to 6942 (crtZ); and an
ORF coding for a polypeptide with a molecular weight of 19368 Da from 8315 to 7770 (ORF-16). ORF-1 and crtE have
the opposite transcriptional orientation from the others (Fig. 6). The translation start sites of the ORFs crtI, crtY and crtZ
could clearly be determined based on the appropriately located sequences homologous to the Shine/Dalgarno (S/D)
[Shine and Dalgarno, Proc. Natl. Acad. Sci. USA 71, 1342-1346 (1974)] consensus sequence AGG-6-9N-ATG (Fig.
10) and the homology to the N-terminal sequences of the respective enzymes of E. herbicola and E. uredovora. The
translation of the ORF crtB could potentially start from three closely spaced codons ATG (4316), ATG (4241) and ATG
(4211). The first one, although not having the best S/D sequence of the three, gives a translation product with the
highest homology to the N-terminus of the E. herbicola and E. uredovora crtB protein, and is therefore the most likely
translation start site. The translation of ORF crtE could potentially start from five different start codons found within 150 bp:
ATG (2389), ATG (2446), ATG (2473), ATG (2497) and ATG (2521). We believe that, based on the following
observations, the ATG (2521) is the most likely transcription start site of crtE: this ATG start codon is preceded by the best
consensus S/D sequence of all five putative start sites mentioned; and the putative N-terminal amino acid sequence of the
protein encoded has the highest homology to the N-terminus of the crtE enzymes of E. herbicola and E. uredovora.
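
The consensus AGG-6-9N-ATG referred to above can be written as a pattern search: AGG, then six to nine arbitrary nucleotides, then ATG. The following is only an illustrative sketch and not the GCG procedure used in this work; the function name and the short test sequence in the usage note are made up for the example.

```python
import re

# AGG, a spacer of 6 to 9 nucleotides, then an ATG start codon.
# The lookahead allows candidates starting at overlapping AGG positions;
# the lazy quantifier reports the shortest valid spacer for each AGG.
SD_PATTERN = re.compile(r"(?=(AGG[ACGT]{6,9}?ATG))")


def find_sd_starts(seq):
    """Return a list of (atg_index, spacer_length) tuples, 0-based indices."""
    hits = []
    for match in SD_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        atg_index = match.start() + len(motif) - 3
        spacer_length = len(motif) - 6  # motif minus AGG and ATG
        hits.append((atg_index, spacer_length))
    return hits
```

For the invented sequence `CCAGGAAACCCTATGAAA`, `find_sd_starts` returns one candidate, the ATG at index 12 with a 7-nucleotide spacer. A real analysis would also weigh how well the region matches a longer S/D consensus, which this pattern does not attempt.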

Characteristics of the crt translational initiation sites and gene products. 

The translational start sites of the five carotenoid biosynthesis genes are shown below and the possible ribosome
binding sites are underlined. The genes crtZ, crtY, crtI and crtB are grouped so tightly that the TGA stop codon of the
anterior gene overlaps the ATG of the following gene. Only three of the five genes (crtI, crtY and crtZ) fit with the
consensus for optimal S/D sequences. The boxed TGA sequence shows the stop codon of the anterior gene.

            -10            +7

ACG AAGGC ACCG ATG ACQ C CCA           crtE

C GG A CCTGGC CGT CGCP^rGP\cCGATC      crtB

CGGATCGCAA TAC P^G^CCATG               crtY

CTGC AGGAG AGAGCi^gggGTTCCG            crtI

GCAAGGGGCCGGCATGAGCAC IT               crtZ



Amino acid sequences of individual crt genes of Flavobacterium sp. R1534.

All five ORFs of Flavobacterium sp. R1534 having homology to known carotenoid biosynthesis genes of other
species are clustered in approx. 5.2 kb of the sequence (Fig. 7).

GGDP synthase (crtE)

The amino acid (aa) sequence of the geranylgeranyl pyrophosphate synthase (crtE gene product) consists of 295
aa and is shown in figure 8. This enzyme condenses farnesyl pyrophosphate and isopentenyl pyrophosphate in a 1'-4.

Phytoene synthase (crtB)

This enzyme catalyzes two enzymatic steps. First it condenses in a head to head reaction two geranylgeranyl
pyrophosphates (C20) to the C40 carotenoid prephytoene. Second it rearranges the cyclopropyl ring of prephytoene to phytoene.
The 303 aa encoded by the crtB gene of Flavobacterium sp. R1534 is shown in figure 9.

Phytoene desaturase (crtI)

The phytoene desaturase of Flavobacterium sp. R1534 consisting of 494 aa, shown in figure 10, performs, like the
crtI enzyme of E. herbicola and E. uredovora, four desaturation steps, converting the non-coloured carotenoid
phytoene to the red coloured lycopene.

Lycopene cyclase (crtY)

The crtY gene product of Flavobacterium sp. R1534 is sufficient to introduce the β-ionone rings at both sides of
lycopene to obtain β-carotene. The lycopene cyclase of Flavobacterium sp. R1534 consists of 382 aa (Fig. 11).

β-carotene hydroxylase (crtZ)

The gene product of crtZ consists of 169 aa (Fig. 12) and hydroxylates β-carotene to the xanthophyll zeaxanthin.
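
As a consistency check, the ORF coordinates listed for the five crt genes can be converted into nucleotide spans and codon counts and compared with the residue counts quoted for each gene product. The sketch below uses only numbers stated in this example; the helper names are illustrative. Whether the listed end coordinate includes a stop codon is not stated, so the comparison is simply span divided by three against the quoted number of residues.

```python
# (start, end) as listed in the text; start > end means the ORF runs on the
# opposite strand relative to the numbering of the sequence.
ORFS = {
    "crtE": (2521, 3405),
    "crtB": (4316, 3408),
    "crtI": (5797, 4316),
    "crtY": (6942, 5797),
    "crtZ": (7448, 6942),
}

# Residue counts quoted for each gene product.
STATED_AA = {"crtE": 295, "crtB": 303, "crtI": 494, "crtY": 382, "crtZ": 169}


def orf_length(start, end):
    """Inclusive nucleotide span of an ORF, independent of strand."""
    return abs(end - start) + 1


def codon_count(start, end):
    """Number of complete codons; raises if the span is not a multiple of 3."""
    length = orf_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a multiple of 3")
    return length // 3


summary = {gene: codon_count(*coords) for gene, coords in ORFS.items()}
mismatches = {g: (n, STATED_AA[g]) for g, n in summary.items() if n != STATED_AA[g]}
```

With the coordinates above, `mismatches` is empty: every span divided by three equals the quoted residue count.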

Putative enzymatic functions of the ORFs (orf-1, orf-5 and orf-16)

The orf-1 has at the aa level over 40% identity to acetoacetyl-CoA thiolases of different organisms (e.g. Candida
tropicalis, human, rat). This gene is therefore most likely a putative acetoacetyl-CoA thiolase (acetyl-CoA
acetyltransferase), which condenses two molecules of acetyl-CoA to acetoacetyl-CoA. Condensation of acetoacetyl-CoA with a
third acetyl-CoA by the HMG-CoA synthase forms β-hydroxy-β-methylglutaryl-CoA (HMG-CoA). This compound is part
of the mevalonate pathway which produces besides sterols also numerous kinds of isoprenoids with diverse cellular
functions. In bacteria and plants, the isoprenoid pathway is also able to synthesize some unique products like
carotenoids, growth regulators (e.g. in plants gibberellins and abscisic acid) and secondary metabolites like phytoalexins [Riou
et al., Gene 148, 293-297 (1994)].

The orf-5 has a low homology of approx. 30% to the amino acid sequence of polyketide synthases from different
streptomyces (e.g. S. violaceoruber, S. cinnamonensis). These antibiotic synthesizing enzymes (polyketide synthases)
have been classified into two groups. Type-I polyketide synthases are large multifunctional proteins, whereas type-II
polyketide synthases are multiprotein complexes composed of several individual proteins involved in the subreactions
of the polyketide synthesis [Bibb et al., Gene 142, 31-39 (1994)].

The putative protein encoded by the orf-16 has at the aa level an identity of 42% when compared to the soluble
hydrogenase subunit of Anabaena cylindrica.


Functional assignment of the ORFs (crtE, crtB, crtI, crtY and crtZ) to enzymatic activities of the carotenoid
biosynthesis pathway.

The biochemical assignment of the gene products of the different ORFs was revealed by analyzing carotenoid
accumulation in E. coli host strains that were transformed with deleted variants of the Flavobacterium sp. gene cluster
and thus expressed not all of the crt genes (Fig. 13).

Three different plasmids were constructed: pLyco, p59-2 and pZea4. Plasmid p59-2 was obtained by subcloning the
HindIII/BamHI fragment of clone 2 into the HindIII/BamHI sites of clone 59. p59-2 carries the ORFs of the crtE, crtB,
crtI and crtY gene and should lead to the production of β-carotene. pLyco was obtained by deleting the KpnI/KpnI
fragment, coding for approx. one half (N-terminus) of the crtY gene, from the p59-2 plasmid. E. coli cells transformed with
pLyco, and therefore having a truncated non-functional crtY gene, should produce lycopene, the precursor of
β-carotene. pZea4 was constructed by ligation of the AscI-SpeI fragment of p59-2, containing the crtE, crtB, crtI and most of
the crtY gene, with the AscI/XbaI fragment of clone 6a, containing the sequences to complete the crtY gene and the crtZ
gene. pZea4 [for complete sequence see Fig. 24; nucleotides 1 to 683 result from pBluescriptIIKS(+), nucleotides 684
to 8961 from Flavobacterium R1534 WT genome, nucleotides 8962 to 11233 from pBluescriptIIKS(+)] has therefore all
five ORFs of the zeaxanthin biosynthesis pathway. Plasmid pZea4 has been deposited on May 25, 1995 at the
DSM-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (Germany) under accession No. DSM 10012. E.
coli cells transformed with this latter plasmid should therefore produce zeaxanthin. For the detection of the carotenoid
produced, transformants were grown for 48 hours in shake flasks and then subjected to carotenoid analysis as
described in the methods section. Figure 13 summarizes the different inserts of the plasmids described above, and the
main carotenoid detected in the cells.
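
The nucleotide ranges given for pZea4 can be checked for contiguity and summed to obtain the size of the insert and of the vector portion. This is a bookkeeping sketch using only the coordinates quoted above; the variable and function names are illustrative.

```python
# (origin, first nucleotide, last nucleotide) as quoted for pZea4.
SEGMENTS = [
    ("pBluescriptIIKS(+)", 1, 683),
    ("Flavobacterium R1534 WT genome", 684, 8961),
    ("pBluescriptIIKS(+)", 8962, 11233),
]


def segment_lengths(segments):
    """Return [(origin, length)], checking that segments abut without gaps."""
    lengths = []
    expected_start = segments[0][1]
    for origin, first, last in segments:
        if first != expected_start:
            raise ValueError(f"gap or overlap before position {first}")
        lengths.append((origin, last - first + 1))
        expected_start = last + 1
    return lengths


lengths = segment_lengths(SEGMENTS)
total_bp = sum(n for _, n in lengths)
insert_bp = sum(n for origin, n in lengths if origin.startswith("Flavobacterium"))
vector_bp = total_bp - insert_bp
```

With these coordinates the Flavobacterium insert spans 8278 bp and the two vector segments together 2955 bp, for a total of 11233 bp.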

As expected the pLyco carrying E. coli cells produced lycopene, those carrying p59-2 produced β-carotene
(all-E, 9-Z, 13-Z) and the cells having the pZea4 construct produced zeaxanthin. This confirms that all the necessary genes
of Flavobacterium sp. R1534 for the synthesis of zeaxanthin or their precursors (phytoene, lycopene and β-carotene)
were cloned.
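
The logic of this deletion analysis can be summarized as a walk along a linear pathway: the end product is the last intermediate for which every upstream gene is intact. The sketch below is a simplified model for illustration only. It assumes each intact gene is expressed and functional in the host, and it ignores isomers and side products; the function name and the gene sets assigned to each plasmid follow the description above.

```python
# Ordered steps after GGPP formation: (gene, product of that step).
PATHWAY = [
    ("crtB", "phytoene"),
    ("crtI", "lycopene"),
    ("crtY", "beta-carotene"),
    ("crtZ", "zeaxanthin"),
]


def predicted_product(intact_genes):
    """Return the expected end product for a set of intact crt genes.

    Returns None when crtE is missing, since no GGPP would then be formed
    by the cloned cluster in this simplified model.
    """
    if "crtE" not in intact_genes:
        return None
    product = "GGPP"
    for gene, step_product in PATHWAY:
        if gene not in intact_genes:
            break  # pathway stops at the first missing enzyme
        product = step_product
    return product


PLASMIDS = {
    "pLyco": {"crtE", "crtB", "crtI"},  # crtY truncated
    "p59-2": {"crtE", "crtB", "crtI", "crtY"},
    "pZea4": {"crtE", "crtB", "crtI", "crtY", "crtZ"},
}
predictions = {name: predicted_product(genes) for name, genes in PLASMIDS.items()}
```

Under these assumptions the model predicts lycopene for pLyco, β-carotene for p59-2 and zeaxanthin for pZea4, in line with the carotenoids detected.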

Example 3 

Materials and methods used for expression of carotenoid synthesizing enzymes

Bacterial strains and plasmids: The vectors pBluescriptIIKS(+) or (-) (Stratagene, La Jolla, USA) and pUC18
[Vieira and Messing, Gene 19, 259-268 (1982); Norrander et al., Gene 26, 101-106 (1983)] were used for cloning in
different E. coli strains, like XL-1 blue (Stratagene), TG1 or JM109. In all B. subtilis transformations, strain 1012 was used.
Plasmids pHP13 [Haima et al., Mol. Gen. Genet. 209, 335-342 (1987)] and p602/22 [LeGrice, S.F.J. in Gene
Expression Technology, Goeddel, D.V., Editor, 201-214 (1990)] are Gram (+)/(-) shuttle vectors able to replicate in B. subtilis
and E. coli cells. Plasmid p205 contains the vegI promoter cloned into the SmaI site of pUC18. Plasmid pXI12 is an
integration vector for the constitutive expression of genes in B. subtilis [Haiker et al., in 7th Int. Symposium on the Genetics
of Industrial Microorganisms, June 26-July 1, 1994, Montreal, Quebec, Canada (1994)]. Plasmid pBEST501 [Itaya et
al., Nucleic Acids Res. 17 (11), 4410 (1989)] contains the neomycin resistance gene cassette originating from the
plasmid pUB110 (GenBank entry: M19465) of S. aureus [McKenzie et al., Plasmid 15, 93-103 (1986); McKenzie et al.,
Plasmid 17, 83-84 (1987)]. This neomycin gene has been shown to work as a selection marker when present in a single
copy in the genome of B. subtilis. Plasmid pC194 (ATCC 37034) (GenBank entry: L08860) originates from S. aureus
[Horinouchi and Weisblum, J. Bacteriol. 150, 815-825 (1982)] and contains the chloramphenicol acetyltransferase
gene.

Media and growth conditions: E. coli were grown in Luria broth (LB) at 37°C with 100 µg Ampicillin (Amp)/ml for
selection. B. subtilis cells were grown in VY-medium supplemented with either erythromycin (1 µg/ml), neomycin
(5-180 µg/ml) or chloramphenicol (10-80 µg/ml).

Transformation: E. coli transformations were done by electroporation using the Gene-pulser device of BIO-RAD
(Hercules, CA, USA) with the following parameters (200 Ω, 250 µFD, 2.5 V). B. subtilis transformations were done
basically according to the standard procedure method 2.8 described by [Cutting and Vander Horn in Molecular Biological
Methods for Bacillus, Harwood, C.R. and Cutting, S.M., Editor, John Wiley & Sons, Chichester, England, 61-74 (1990)].

Colony screening: Bacterial colony screening was done as described by [Zon et al., s.a.].

Oligonucleotide synthesis: The oligonucleotides used for PCR reactions or for sequencing were synthesized with
an Applied Biosystems 392 DNA synthesizer.

PCR reactions: The PCR reactions were performed using either the UlTma DNA polymerase (Perkin Elmer
Cetus) or the Pfu Vent polymerase (New England Biolabs) according to the manufacturers' instructions. A typical 50
µl PCR reaction contained: 100 ng of template DNA, 10 pM of each of the primers, all four dNTPs (final conc. 300 µM),
MgCl2 (when UlTma polymerase was used; final conc. 2 mM), 1x UlTma reaction buffer or 1x Pfu buffer (supplied by
the manufacturer). All components of the reaction with the exception of the DNA polymerase were incubated at 95°C
for 2 min. followed by the cycles indicated in the respective section (see below). In all reactions a hot start was made,
by adding the polymerase in the first round of the cycle during the 72°C elongation step. At the end of the PCR reaction
an aliquot was analysed on 1% agarose gel, before extracting once with phenol/chloroform. The amplified fragment in
the aqueous phase was precipitated with 1/10 of a 3 M NaAcetate solution and two volumes of ethanol. After
centrifugation for 5 min. at 12000 rpm, the pellet was resuspended in an adequate volume of H2O, typically 40 µl, before
digestion with the indicated restriction enzymes was performed. After the digestion the mixture was separated on a 1% low
melting point agarose. The PCR products of the expected size were excised from the agarose and purified using the
glass beads method (GENECLEAN KIT, Bio 101, Vista CA, USA) when the fragments were above 400 bp or directly
spun out of the gel when the fragments were shorter than 400 bp as described by [Heery et al., TIBS 6 (6), 173 (1990)].
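
When setting up such a reaction, the volume of each stock solution follows from the dilution relation C1·V1 = C2·V2. The sketch below applies it to the 50 µl reaction volume and the final concentrations given above (dNTPs 300 µM, MgCl2 2 mM). The stock concentrations are not given in the text and are assumed here purely for illustration, as is the function name.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul=50.0):
    """Volume of stock (in µl) needed to reach final_conc in the reaction.

    final_conc and stock_conc must be expressed in the same unit.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc * reaction_volume_ul / stock_conc


# Hypothetical stock solutions (assumptions, not from the original text).
DNTP_STOCK_UM = 10_000.0  # 10 mM dNTP mix, expressed in µM
MGCL2_STOCK_MM = 25.0     # 25 mM MgCl2

dntp_ul = stock_volume_ul(300.0, DNTP_STOCK_UM)   # 300 µM final
mgcl2_ul = stock_volume_ul(2.0, MGCL2_STOCK_MM)   # 2 mM final
```

With these assumed stocks, 1.5 µl of dNTP mix and 4 µl of MgCl2 would be pipetted per 50 µl reaction; other stock concentrations scale the volumes inversely.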

Oligos used for gene amplification and site directed mutagenesis: 

All PCR reactions performed to allow the construction of the different plasmids are described below. All the primers
used are summarized in figure 14.

Primers #100 and #101 were used in a PCR reaction to amplify the complete crtE gene having a SpeI restriction
site and an artificial ribosomal binding site (RBS) upstream of the transcription start site of this gene. At the 3' end of
the amplified fragment, two unique restriction sites were introduced, an AvrII and a SmaI site, to facilitate the further
cloning steps. The PCR reaction was done with UlTma polymerase using the following conditions for the amplification:
5 cycles with the profile: 95°C, 1 min./ 50°C, 45 sec./ 72°C, 1 min. and 20 cycles with the profile: 95°C, 1 min./ 72°C, 1
min. Plasmid pBIIKS(+)-clone2 served as template DNA. The final PCR product was digested with SpeI and SmaI and
isolated using the GENECLEAN KIT. The size of the fragment was approx. 910 bp.

Primers #104 and #105 were used in a PCR reaction to amplify the crtZ gene from the translation start till the SalI
restriction site, located in the coding sequence of this gene. At the 5' end of the crtZ gene an EcoRI site, a synthetic RBS
and a NdeI site were introduced. The PCR conditions were as described above. Plasmid pBIIKS(+)-clone 6a served as
template DNA and the final PCR product was digested with EcoRI and SalI. Isolation of the fragment of approx. 480 bp
was done with the GENECLEAN KIT.

Primers MUT1 and MUT5 were used to amplify the complete crtY gene. At the 5' end, the last 23 nucleotides of the
crtZ gene including the SalI site are present, followed by an artificial RBS preceding the translation start site of the crtY
gene. The artificial RBS created includes a PmlI restriction site. The 3' end of the amplified fragment contains 22
nucleotides of the crtI gene, preceded by a newly created artificial RBS which contains a MunI restriction site. The conditions
used for the PCR reaction were as described above using the following cycling profile: 5 rounds of 95°C, 45 sec./ 60°C,
45 sec./ 72°C, 75 sec., followed by 22 cycles with the profile: 95°C, 45 sec./ 66°C, 45 sec./ 72°C, 75 sec. Plasmid
pXI12-ZYIB-EINV4 served as template for the Pfu Vent polymerase. The PCR product of 1225 bp was made blunt and
cloned into the SmaI site of pUC18, using the Sure-Clone Kit (Pharmacia) according to the manufacturer.

Primers MUT2 and MUT6 were used to amplify the complete crtI gene. At the 5' end the last 23 nucleotides of the crtY
gene are present, followed by an artificial RBS which precedes the translation start site of the crtI gene. The new RBS
created includes a MunI restriction site. The 3' end of the amplified fragment contains the artificial RBS upstream of the
crtB gene including a BamHI restriction site. The conditions used for the PCR reaction were basically as described
above including the following cycling profile: 5 rounds of 95°C, 30 sec./ 60°C, 30 sec./ 72°C, 75 sec., followed by 25
cycles with the profile: 95°C, 30 sec./ 66°C, 30 sec./ 72°C, 75 sec. Plasmid pXI12-ZYIB-EINV4 served as template for
the Pfu Vent polymerase. For the further cloning steps the PCR product of 1541 bp was digested with MunI and BamHI.

Primers MUT3 and CAR17 were used to amplify the N-terminus of the crtB gene. At the 5' end the last 28 nucleotides
of the crtI gene are present followed by an artificial RBS, preceding the translation start site of the crtB gene. This newly
created RBS includes a BamHI restriction site. The amplified fragment, named PCR-F, also contains the HindIII
restriction site located at the N-terminus of the crtB gene. The conditions used for the PCR reaction were as described
elsewhere in the text, including the following cycling profile: 5 rounds of 95°C, 30 sec./ 58°C, 30 sec./ 72°C, 20 sec.,
followed by 25 cycles with the profile: 95°C, 30 sec./ 60°C, 30 sec./ 72°C, 20 sec. Plasmid pXI12-ZYIB-EINV4 served as
template for the Pfu Vent polymerase. The PCR product of approx. 160 bp was digested with BamHI and HindIII.


Oligos used to amplify the chloramphenicol resistance gene (cat): 

Primers CAT3 and CAT4 were used to amplify the chloramphenicol resistance gene of pC194 (ATCC 37034)
[Horinouchi and Weisblum, s.a.], an R-plasmid found in S. aureus. The conditions used for the PCR reaction were as
described previously including the following cycling profile: 5 rounds of 95°C, 60 sec./ 50°C, 60 sec./ 72°C, 2 min.,
followed by 20 cycles with the profile: 95°C, 60 sec./ 60°C, 60 sec./ 72°C, 2 min. Plasmid pC194 served as template for
the Pfu Vent polymerase. The PCR product of approx. 1050 bp was digested with EcoRI and AatII.

Oligos used to generate linkers: Linkers were obtained by adding 90 ng of each of the two corresponding primers
into an Eppendorf tube. The mixture was dried in a speed vac and the pellet resuspended in 1x Ligation buffer
(Boehringer, Mannheim, Germany). The solution was incubated at 50°C for 3 min. before cooling down to RT, to allow the
primers to hybridize properly. The linkers were then ready to be ligated into the appropriate sites. All the oligos used to
generate linkers are shown in figure 15.

Primers CS1 and CS2 were used to form a linker containing the following restriction sites: HindIII, AflII, ScaI, XbaI,
PmeI and EcoRI.

Primers MUT7 and MUT8 were used to form a linker containing the restriction sites SalI, AvrII, PmlI, MluI, MunI,
BamHI, SphI and HindIII.

Primers MUT9 and MUT10 were used to introduce an artificial RBS upstream of crtY.

Primers MUT11 and MUT12 were used to introduce an artificial RBS upstream of crtE.

Isolation of RNA: Total RNA was prepared from log phase growing B. subtilis according to the method described
by [Maes and Messens, Nucleic Acids Res. 20 (16), 4374 (1992)].

Northern Blot analysis: For hybridization experiments up to 30 µg of B. subtilis RNA was electrophoresed on a
1% agarose gel made up in 1x MOPS and 0.66 M formaldehyde. Transfer to Zeta-Probe blotting membranes (BIO-RAD),
UV cross-linking, pre-hybridization and hybridization were done as described in [Farrell, J.R.E.; RNA
Methodologies. A laboratory Guide for isolation and characterization. San Diego, USA: Academic Press (1993)]. The
washing conditions used were: 2 x 20 min. in 2x SSPE/0.1% SDS followed by 1 x 20 min. in 0.1x SSPE/0.1% SDS at
65°C. Northern blots were then analyzed either by a Phosphorimager (Molecular Dynamics) or by autoradiography on
X-ray films from Kodak.

Isolation of genomic DNA: B. subtilis genomic DNA was isolated from 25 ml overnight cultures according to the
standard procedure method 2.6 described by [13].

Southern blot analysis: For hybridization experiments B. subtilis genomic DNA (3 µg) was digested with the
appropriate restriction enzymes and electrophoresed on a 0.75% agarose gel. The transfer to Zeta-Probe blotting
membranes (BIO-RAD) was done as described [Southern, E.M., s.a.]. Prehybridization and hybridization were in
7% SDS, 1% BSA (fraction V; Boehringer), 0.5 M Na2HPO4, pH 7.2 at 65°C. After hybridization the membranes were
washed twice for 5 min. in 2x SSC, 1% SDS at room temperature and twice for 15 min. in 0.1x SSC, 0.1% SDS at
65°C. Southern blots were then analyzed either by a Phosphorimager (Molecular Dynamics) or by autoradiography on
X-ray films from Kodak.

DNA sequence analysis: The sequence was determined by the dideoxy chain termination technique [Sanger et
al., s.a.] using the Sequenase Kit Version 1.0 (United States Biochemical). Sequence analyses were done using the
GCG sequence analysis software package (Version 8.0) by Genetics Computer, Inc. [Devereux et al., s.a.].

Gene amplification in B. subtilis: To amplify the copy number of the SFCO in B. subtilis transformants, a single
colony was inoculated in 15 ml VY-medium supplemented with 1.5% glucose and 0.02 mg chloramphenicol or
neomycin/ml, depending on the antibiotic resistance gene present in the amplifiable structure (see results and discussion).
The next day 750 µl of this culture were used to inoculate 13 ml VY-medium containing 1.5% glucose supplemented
with chloramphenicol (60, 80, 120 and 150 µg/ml) for the cat resistant mutants, or neomycin (160 µg/ml and 180 µg/ml)
for the neomycin resistant mutants. The cultures were grown overnight and the next day 50 µl of different dilutions
(1:20, 1:400, 1:8000, 1:160'000) were plated on VY agar plates with the appropriate antibiotic concentration. Large
single colonies were then further analyzed to determine the number of copies and the amount of carotenoids produced.
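
The four plating dilutions listed above form a serial 20-fold series. As a small illustration, the following Python sketch regenerates them; the function name is hypothetical and nothing beyond the dilution factors themselves is taken from the text.

```python
def serial_dilutions(step, count):
    """Cumulative dilution factors of a serial dilution with a fixed step."""
    return [step ** i for i in range(1, count + 1)]


# Four successive 20-fold steps, matching the dilutions plated in the text.
factors = serial_dilutions(step=20, count=4)
for factor in factors:
    print(f"1:{factor}")
```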

Analysis of carotenoids: E. coli or B. subtilis transformants (200 - 400 ml) were grown for the times indicated in
the text, usually 24 to 72 hours, in LB-medium or VY-medium, respectively, supplemented with antibiotics, in shake
flasks at 37°C and 220 rpm.

The carotenoids produced by the microorganisms were extracted with an adequate volume of acetone using a
rotation homogenizer (Polytron, Kinematica AG, CH-Luzern). The homogenate was then filtered through the sintered glass
of a suction filter into a round bottom flask. The filtrate was evaporated by means of a rotation evaporator at 50°C using
a water-jet vacuum. For the zeaxanthin detection the residue was dissolved in n-hexane/acetone (86:14) before analysis
with a normal-phase HPLC as described in [Weber, S., s.a.]. For the detection of β-carotene and lycopene the
evaporated extract was dissolved in n-hexane/acetone (99:1) and analysed by HPLC as described in [Hengartner et al., s.a.].

Example 4

Carotenoid production in E. coli

The biochemical assignment of the gene products of the different open reading frames (ORF's) of the carotenoid
biosynthesis cluster of Flavobacterium sp. was revealed by analyzing the carotenoid accumulation in E. coli host
strains, transformed with plasmids carrying deletions of the Flavobacterium sp. gene cluster, and thus lacking some of
the crt gene products. Similar functional assays in E. coli have been described by other authors [Misawa et al., s.a.;
Perry et al., J. Bacteriol. 168, 607-612 (1986); Hundle et al., Molecular and General Genetics 254 (4), 406-416 (1994)].
Three different plasmids, pLyco, pBIIKS(+)-clone59-2 and pZea4, were constructed from the three genomic isolates
pBIIKS(+)-clone2, pBIIKS(+)-clone59 and pBIIKS(+)-clone6a (see figure 16).

Plasmid pBIIKS(+)-clone59-2 was obtained by subcloning the HindIII/BamHI fragment of pBIIKS(+)-clone2 into the
HindIII/BamHI sites of pBIIKS(+)-clone59. The resulting plasmid pBIIKS(+)-clone59-2 carries the complete ORF's of
the crtE, crtB, crtI and crtY genes and should lead to the production of β-carotene. pLyco was obtained by deleting the
KpnI/KpnI fragment, coding for approx. one half (N-terminus) of the crtY gene, from the plasmid pBIIKS(+)-clone59-2.
E. coli cells transformed with pLyco, and therefore having a truncated non-functional crtY gene, should produce
lycopene, the precursor of β-carotene. pZea4 was constructed by ligation of the AscI-SpeI fragment of
pBIIKS(+)-clone59-2, containing the crtE, crtB, crtI and most of the crtY gene, with the AscI/XbaI fragment of clone 6a,
containing the crtZ gene and sequences to complete the truncated crtY gene mentioned above. pZea4 has therefore all
five ORF's of the zeaxanthin biosynthesis pathway. E. coli cells transformed with this latter plasmid should therefore
produce zeaxanthin. For the detection of the carotenoid produced, transformants were grown for 43 hours in shake flasks
and then subjected to carotenoid analysis as described in the methods section. Figure 16 summarizes the construction of
the plasmids described above.

As expected the pLyco carrying E. coli cells produced lycopene, those carrying pBIIKS(+)-clone59-2 produced
β-carotene (all-E, 9-Z, 13-Z) and the cells having the pZea4 construct produced zeaxanthin. This confirms that we have
cloned all the necessary genes of Flavobacterium sp. R1534 for the synthesis of zeaxanthin or its precursors
(phytoene, lycopene and β-carotene). The production levels obtained are shown in table 1.



plasmid                 host             zeaxanthin    β-carotene    lycopene
pLyco                   E. coli JM109    ND            ND            0.05%
pBIIKS(+)-clone59-2                                    0.03%         ND
pZea4                                    0.033%        0.0009%

Table 1: Carotenoid content of E. coli transformants, carrying the plasmids
pLyco, pBIIKS(+)-clone59-2 and pZea4, after 43 hours of culture
in shake flasks. The values indicated show the carotenoid content
in % of the total dry cell mass (200 ml). ND = not detectable.



Example 5

Carotenoid production in B. subtilis

In a first approach to produce carotenoids in B. subtilis, we cloned the carotenoid biosynthesis genes of
Flavobacterium into the Gram (+)/(-) shuttle vector p602/22, a derivative of p602/20 [LeGrice, S.F.J., s.a.]. The
assembling of the final construct p602-CARVEG-E begins with a triple ligation of the PvuII-AvrII fragment of
pZea4(del654-3028) and the AvrII-EcoRI fragment from plasmid pBIIKS(+)-clone6a into the EcoRI and ScaI sites of the
vector p602/22. The plasmid pZea4(del654-3028) had been obtained by digesting pZea4 with SacI and EspI. The protruding
and recessed ends were made blunt with Klenow enzyme and religated. Construct pZea4(del654-3028) lacks most of
the sequence upstream of the crtE gene, which is not needed for the carotenoid biosynthesis. The plasmid p602-CAR has
approx. 6.7 kb of genomic Flavobacterium R1534 DNA containing, besides all five carotenoid genes (approx. 4.9 kb),
additional genomic DNA of 1.2 kb, located upstream of the crtZ translation start site, and a further 200 bp, located
upstream of the crtE transcription start. The crtZ, crtY, crtI and crtB genes were cloned downstream of the PN25/0
promoter, a regulatable E. coli bacteriophage T5 promoter derivative, fused to a lac operator element, which is functional
in B. subtilis [LeGrice, S.F.J., s.a.]. It is obvious that in the p602CAR construct, the distance of over 1200 bp between
the PN25/0 promoter and the transcription start site of crtZ is not optimal and will be improved at a later stage. An
outline of the p602CAR construction is shown in figure 17. To ensure transcription of the crtE gene in B. subtilis, the
vegI promoter [Moran et al., Mol. Gen. Genet. 186, 339-346 (1982); LeGrice et al., Mol. Gen. Genet. 204, 229-236 (1986)]
was introduced upstream of this gene, resulting in the plasmid construct p602-CARVEG-E. The vegI promoter, which
originates from site I of the veg promoter complex described by [LeGrice et al., s.a.], has been shown to be functional in
E. coli [Moran et al., s.a.]. To obtain this new construct, the plasmid p602CAR was digested with SalI and HindIII, and the
fragment containing the complete crtE gene and most of the crtB coding sequence was subcloned into the XhoI and
HindIII sites of plasmid p205. The resulting plasmid p205CAR contains the crtE gene just downstream of the PvegI
promoter. To reconstitute the carotenoid gene cluster of Flavobacterium sp. the following three pieces were isolated:
the PmeI/HindIII fragment of p205CAR, the HincII/XbaI fragment and the EcoRI/HindIII fragment of p602CAR, and ligated
into the EcoRI and XbaI sites of pBluescriptIIKS(+), resulting in the construct pBIIKS(+)-CARVEG-E. Isolation of the
EcoRI/XbaI fragment of this latter plasmid and ligation into the EcoRI and XbaI sites of p602/22 gives a plasmid similar
to p602CAR but having the crtE gene driven by the PvegI promoter. All the construction steps to get the plasmid
p602CARVEG-E are outlined in figure 18. E. coli TG1 cells transformed with this plasmid synthesized zeaxanthin. In
contrast, B. subtilis strain 1012 transformed with the same constructs did not produce any carotenoids. Analysis of
several zeaxanthin negative B. subtilis transformants always revealed that the transformed plasmids had undergone
severe deletions. This instability could be due to the large size of the constructs.

In order to obtain a stable construct in B. subtilis, the carotenoid genes were cloned into the Gram (+)/(-) shuttle
vector pHP13 constructed by [Haima et al., s.a.]. The stability problems were thought to be avoided by 1) reducing the
size of the cloned insert which carries the carotenoid genes and 2) reversing the orientation of the crtE gene and thus
only requiring one promoter for the expression of all five genes, instead of two as in the previous constructs.
Furthermore, the ability of cells transformed by such a plasmid carrying the synthetic Flavobacterium carotenoid operon
(SFCO) to produce carotenoids would answer the question whether a modular approach is feasible. Figure 19 summarizes
all the construction steps and intermediate plasmids made to get the final construct pHP13-2PN25ZYIB-EINV. Briefly: To
facilitate the following constructions, a vector pHP13-2 was made by introducing a synthetic linker, obtained with
primers CS1 and CS2, between the HindIII and EcoRI sites of the shuttle vector pHP13. The intermediate construct
pHP13-2CARVEG-E was constructed by subcloning the AflII-XbaI fragment of p602CARVEG-E into the AflII and XbaI sites
of pHP13-2. The next step consisted in the inversion of the crtE gene, by removing the XbaI-AvrII fragment containing
the original crtE gene and replacing it with the XbaI-AvrII fragment of plasmid pBIIKS(+)-PCRRBScrtE. The resulting
plasmid was named pHP13-2CARZYIB-EINV and represented the first construction with a functional SFCO. The
intermediate construct pBIIKS(+)-PCRRBScrtE mentioned above was obtained by digesting the PCR product generated with
primers #100 and #101 with SpeI and SmaI and ligating into the SpeI and SmaI sites of pBluescriptIIKS(+). In order to
get the crtZ transcription start close to the promoter PN25/0, a triple ligation was done with the BamHI-SalI fragment of
pHP13-2CARZYIB-EINV (contains four of the five carotenoid genes), the BamHI-EcoRI fragment of the same plasmid
containing the PN25/0 promoter and the EcoRI-SalI fragment of pBIISK(+)-PCRRBScrtZ, having most of the crtZ gene
preceded by a synthetic RBS. The aforementioned plasmid pBIISK(+)-PCRRBScrtZ was obtained by digesting the
PCR product amplified with primers #104 and #105 with EcoRI and SalI and ligating into the EcoRI and SalI sites of
pBluescriptIISK(+). In the resulting vector pHP13-2PN25ZYIB-EINV, the SFCO is driven by the bacteriophage T5
promoter PN25/0, which should be constitutively expressed, due to the absence of a functional lac repressor in the
construct [Peschke and Beuk, J. Mol. Biol. 186, 547-555 (1985)]. E. coli TG1 cells transformed with this construct
produced zeaxanthin. Nevertheless, when this plasmid was transformed into B. subtilis, no carotenoid production could
be detected. Analysis of the plasmids of these transformants showed severe deletions, pointing towards instability
problems similar to the observations made with the aforementioned plasmids.

Example 6

Chromosome Integration Constructs

Due to the instability observed with the previous constructs we decided to integrate the carotenoid biosynthesis
genes of Flavobacterium sp. into the genome of B. subtilis using the integration/expression vector pXI12. This vector
allows the constitutive expression of whole operons after integration into the levan-sucrase gene (sacB) of the B.
subtilis genome. The constitutive expression is driven by the vegI promoter and results in medium level expression. The
plasmid pXI12-ZYIB-EINV4 containing the synthetic Flavobacterium carotenoid operon (SFCO) was constructed as
follows: the NdeI-HincII fragment of pBIISK(+)-PCRRBScrtZ was cloned into the NdeI and SmaI sites of pXI12 and the
resulting plasmid was named pXI12-PCRcrtZ. In the next step, the BstEII-PmeI fragment of pHP13-2PN25ZYIB-EINV
was ligated to the BstEII-PmeI fragment of pXI12-PCRcrtZ (see figure 20). B. subtilis transformed with the resulting
construct pXI12-ZYIB-EINV4 can integrate the CAR genes either via a Campbell type reaction or via a reciprocal
recombination. One transformant, BS1012::ZYIB-EINV4, having a reciprocal recombination of the carotenoid
biosynthesis genes into the levan-sucrase gene was further analyzed (figure 21). Although this strain did not synthesize
carotenoids, RNA analysis by Northern blots showed the presence of specific polycistronic mRNA's of 5.4 kb and 4.2 kb
when hybridized to probe A (see figure 21, panel B). Whereas the larger mRNA has the expected message size, the
origin of the shorter mRNA was unclear. Hybridization of the same Northern blot to probe B only detected the large
mRNA fragment, pointing towards a premature termination of the transcription at the end of the crtB gene. The
presence of a termination signal at this location would make sense, since in the original operon organisation in the
Flavobacterium sp. R1534 genome, the crtE and the crtB genes are facing each other. With this constellation a
transcription termination signal at the 3' end of crtB would make sense, in order to avoid the synthesis of anti-sense RNA
which could interfere with the mRNA transcript of the crtE gene. Since this region has been changed considerably with
respect to the wild type situation, the sequences constituting this terminator may also have been altered, resulting in a
"leaky" terminator. Western blot analysis using antisera against the different crt enzymes of the carotenoid pathway
pointed towards the possibility that the ribosomal binding sites might be responsible for the lack of carotenoid synthesis.
Out of the five genes introduced only the product of crtZ, the β-carotene hydroxylase, was detectable. This is the only
gene preceded by a RBS site, originating from the pXI12 vector, known to be functional in B. subtilis. Base pairing
interactions between a mRNA's Shine-Dalgarno sequence [Shine and Dalgarno, s.a.] and the 16S rRNA, which permit the
ribosome to select the proper initiation site, have been proposed by [McLaughlin et al., J. Biol. Chem. 256, 11283-11291
(1981)] to be much more stable in Gram-positive organisms (B. subtilis) than in Gram-negative organisms (E. coli). In
order to obtain highly stable complexes we exchanged the RBS sites of the Gram-negative Flavobacterium sp., preceding
each of the genes crtY, crtI, crtB and crtE, with synthetic RBS's which were designed complementary to the 3' end
of the B. subtilis 16S rRNA (see table 2). This exchange should allow an effective translation initiation of the different
carotenoid genes in B. subtilis. The strategy chosen to construct this pXI12-ZYIB-EINV4MUTRBS2C, containing all four
altered sites, is summarized in figure 20. In order to facilitate the further cloning steps in pBluescriptIIKS(+), additional
restriction sites were introduced, using the linker obtained with primers MUT7 and MUT8, cloned between the SalI and
HindIII sites of said vector. The new resulting construct pBIIKS(+)-LINKER78 had the following restriction sites
introduced: AvrII, PmlI, MluI, MunI, BamHI and SphI. The general approach chosen to create the synthetic RBS's upstream
of the different carotenoid genes was a combination of PCR based mutagenesis, where the genes were
reconstructed using defined primers carrying the modified RBS sites, and synthetic linkers having such sequences.
Reconstitution of the RBS preceding the crtI and crtB genes was done by amplifying the crtI gene with the primers
MUT2 and MUT6, which include the appropriate altered RBS sites. The PCR-I fragment obtained was digested with
MunI and BamHI and ligated into the MunI and BamHI sites of pBIIKS(+)-LINKER78. The resulting intermediate
construct was named pBIIKS(+)-LINKER78PCRI. Reconstitution of the RBS preceding the crtB gene was done using a
small PCR fragment obtained with primer MUT3, carrying the altered RBS site upstream of crtB, and primer CAR17.
The amplified PCR-F fragment was digested with BamHI and HindIII and subcloned into the BamHI and HindIII sites
of pBIIKS(+)-LINKER78, resulting in the construct pBIIKS(+)-LINKER78PCRF. The PCR-I fragment was cut out of
pBIIKS(+)-LINKER78PCRI with BamHI and SapI and ligated into the BamHI and SapI sites of pBIIKS(+)-LINKER78PCRF.
The resulting plasmid pBIIKS(+)-LINKER78PCRFI has the PCR-I fragment fused to the PCR-F fragment. This
construct was cut with SalI and PmlI and a synthetic linker obtained by annealing of primers MUT9 and MUT10 was
introduced. This latter step was done to facilitate the upcoming replacement of the original Flavobacterium RBS in the above
mentioned construct. The resulting plasmid was named pBIIKS(+)-LINKER78PCRFIA. Assembling of the synthetic
RBS's preceding the crtY and crtI genes was done by PCR, using primers MUT1 and MUT5. The amplified fragment
PCR-G was made blunt end before cloning into the SmaI site of pUC18, resulting in construct pUC18-PCR-G. The next
step was the cloning of the PCR-G fragment between the PCR-A and PCR-I fragments. For this purpose the PCR-G
was isolated from pUC18-PCR-G by digesting with MunI and PmlI and ligated into the MunI and PmlI sites of
pBIIKS(+)-LINKER78PCRFIA. This construct contains all four fragments, PCR-F, PCR-I, PCR-G and PCR-A, assembled
adjacent to each other and containing three of the four artificial RBS sites (crtY, crtI and crtB). The exchange of the
Flavobacterium RBS's preceding the genes crtY, crtI and crtB by synthetic ones was done by replacing the HindIII-SalI
fragment of plasmid pXI12-ZYIB-EINV4 with the HindIII-SalI fragment of plasmid pBIIKS(+)-LINKER78PCRFIGA. The
resulting plasmid pXI12-ZYIB-EINV4MUTRBSC was subsequently transformed into E. coli TG1 cells and B. subtilis
1012. The production of zeaxanthin by these cells confirmed that the PCR amplified genes were functional. The B.
subtilis strain obtained was named BS1012::SFCO1. The last Flavobacterium RBS to be exchanged was the one
preceding the crtE gene. This was done using a linker obtained using primers MUT11 and MUT12. The wild type RBS was
removed from pXI12-ZYIB-EINV4MUTRBSC with NdeI and SpeI and the above mentioned linker was inserted. In the
construct pXI12-ZYIB-EINV4MUTRBS2C all Flavobacterium RBS's have been replaced by synthetic RBS's of the
consensus sequence AAAGGAGG- 7-8 N -ATG (see table 2). E. coli TG1 cells transformed with this construct showed that
also this last RBS replacement had not interfered

Table 2

gene                     mRNA nucleotide sequence
crtZ                     AAAGGAGGGUUUCAUAUGAGC
crtY                     AAAGGAGGACACGUGAUGAGC
crtI                     AAAGGAGGCAAUUGAGAUGAGU
crtB                     AAAGGAGGAUCCAAUCAUGACC
crtE                     AAAGGAGGGUUUCUUAUGACG

B. subtilis 16S rRNA     3'-UCUUUCCUCCACUAG
E. coli 16S rRNA         3'-AUUCCUCCACUAG

Table 2: Nucleotide sequences of the synthetic ribosome binding
sites in the constructs pXI12-ZYIB-EINV4MUTRBS2C,
pXI12-ZYIB-EINV4MUTRBS2CCAT and pXI12-ZYIB-EINV4MUTRBS2CNEO.
Nucleotides of the Shine-Dalgarno sequence preceding the individual carotenoid
genes which are complementary to the 3' end of the 16S
rRNA of B. subtilis are shown in bold. The 3' end of the 16S
rRNA of E. coli is also shown for comparison. The
underlined AUG is the translation start site of the
mentioned gene.
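
The design rule behind Table 2 (a Shine-Dalgarno core AAAGGAGG, a spacer of 7-8 nucleotides, then the AUG start codon, with the core complementary to the 3' end of the 16S rRNA) can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the sequences are transcribed from Table 2 as best they can be read, the function and variable names are hypothetical, and only Watson-Crick pairs are counted (G-U wobble pairs and thermodynamic stability are not modelled).

```python
import re

# Synthetic RBS regions, written 5'->3', as transcribed from Table 2.
RBS = {
    "crtZ": "AAAGGAGGGUUUCAUAUGAGC",
    "crtY": "AAAGGAGGACACGUGAUGAGC",
    "crtI": "AAAGGAGGCAAUUGAGAUGAGU",
    "crtB": "AAAGGAGGAUCCAAUCAUGACC",
    "crtE": "AAAGGAGGGUUUCUUAUGACG",
}

# 3' ends of the 16S rRNAs, written 3'->5' as in Table 2.
BSUB_16S = "UCUUUCCUCCACUAG"
ECOLI_16S = "AUUCCUCCACUAG"

# Consensus from the text: AAAGGAGG, 7-8 arbitrary nucleotides, start codon.
CONSENSUS = re.compile(r"^AAAGGAGG[ACGU]{7,8}AUG")

# Watson-Crick partner of each base.
PAIR = str.maketrans("ACGU", "UGCA")


def longest_pairing(mrna_5to3, rrna_3to5):
    """Length of the longest contiguous Watson-Crick duplex between both strands."""
    partner = mrna_5to3.translate(PAIR)  # bases the rRNA must offer, read 3'->5'
    best = 0
    for start in range(len(partner)):
        for end in range(start + best + 1, len(partner) + 1):
            if partner[start:end] not in rrna_3to5:
                break  # a longer window from this start cannot match either
            best = end - start
    return best


for gene, seq in RBS.items():
    print(gene,
          "consensus ok" if CONSENSUS.match(seq) else "consensus violated",
          "B. subtilis pairs:", longest_pairing(seq, BSUB_16S),
          "E. coli pairs:", longest_pairing(seq, ECOLI_16S))
```

Under these assumptions the AAAGGAGG core pairs over its full eight nucleotides with the B. subtilis 3' end, one base pair more than with the E. coli 3' end, which is in line with the rationale given in the text for designing the sites against the B. subtilis 16S rRNA.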



with the ability to produce zeaxanthin. All the regions containing the newly introduced synthetic RBS's were confirmed
by sequencing. B. subtilis cells were transformed with plasmid pXI12-ZYIB-EINV4MUTRBS2C and one transformant
having integrated the SFCO by reciprocal recombination into the levan-sucrase gene of the chromosome was
selected. This strain was named BS1012::SFCO2. Analysis of the carotenoid production of this strain shows that the
amount of zeaxanthin produced is approx. 40% of the zeaxanthin produced by E. coli cells transformed with the plasmid
used to get the B. subtilis transformant. A similar observation was made when comparing the BS1012::SFCO1 strain with
its E. coli counterpart (30%). Although the E. coli cells have 18 times more carotenoid genes, the carotenoid production
is only a factor of 2-3 higher. More drastic was the difference observed in the carotenoid contents between E. coli
cells carrying the pZea4 construct in about 200 copies and the E. coli carrying the plasmid pXI12-ZYIB-
EINV4MUTRBS2C in 18 copies. The first transformant produced 48x more zeaxanthin than the latter one. This
difference cannot be attributed only to the roughly 11 times more carotenoid biosynthesis genes present in these
transformants. Contributing to this difference is probably also the suboptimal performance of the newly constructed SFCO,
in which the overlapping genes of the wild type Flavobacterium operon were separated to introduce the synthetic
RBS's. This could have resulted in a lower translation efficiency of the rebuilt synthetic operon (e.g. due to elimination
of putative translational coupling effects, present in the wild type operon).
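
The copy number argument above amounts to dividing each yield ratio by the corresponding ratio of gene copies. The Python sketch below reproduces that arithmetic with the figures quoted in the text; the function name is hypothetical, and the single chromosomal SFCO copy assumed for the B. subtilis strains is an assumption inferred from the statement that the E. coli cells carry 18 times more carotenoid genes.

```python
def per_copy_ratio(yield_ratio, copy_ratio):
    """Yield ratio of strain A over strain B, normalised by their gene copy ratio."""
    return yield_ratio / copy_ratio


# E. coli with pZea4 (about 200 copies) versus E. coli with the SFCO plasmid (18 copies):
# 48x more zeaxanthin from roughly 11x more gene copies.
pzea4_vs_sfco = per_copy_ratio(48, 200 / 18)

# B. subtilis integrants (assumed: 1 copy) versus E. coli with 18 plasmid copies:
# 40% (SFCO2) and 30% (SFCO1) of the E. coli zeaxanthin level.
sfco2_vs_ecoli = per_copy_ratio(0.40, 1 / 18)
sfco1_vs_ecoli = per_copy_ratio(0.30, 1 / 18)

print(f"pZea4 vs SFCO plasmid, per copy: {pzea4_vs_sfco:.2f}")
print(f"BS1012::SFCO2 vs E. coli, per copy: {sfco2_vs_ecoli:.2f}")
print(f"BS1012::SFCO1 vs E. coli, per copy: {sfco1_vs_ecoli:.2f}")
```

On these figures each wild type operon copy in pZea4 yields about 4.3 times more zeaxanthin than each SFCO copy in E. coli, while each integrated SFCO copy in B. subtilis yields roughly 5 to 7 times more than a plasmid-borne copy in E. coli.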

In order to increase the carotenoid production, two new constructs were made, pXI12-ZYIB-EINV4MUTRBS2CNEO
and pXI12-ZYIB-EINV4MUTRBS2CCAT, which after the integration of the SFCO into the
levan-sucrase site of the chromosome generate strains with an amplifiable structure as described by [Janniere et al.,
Gene 40, 47-55 (1985)]. Plasmid pXI12-ZYIB-EINV4MUTRBS2CNEO has been deposited on May 25, 1995 at the
DSM-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (Germany) under accession No. DSM 10013.
Such amplifiable structures, when linked to a resistance marker (e.g. chloramphenicol, neomycin, tetracycline), can be
amplified to 20-50 copies per chromosome. The amplifiable structure consists of the SFCO, the resistance gene and the
pXI12 sequence, flanked by direct repeats of the sac-B 3' gene (see figure 22). New strains having elevated numbers
of the SFCO could now be obtained by selecting for transformants with increased level of resistance to the antibiotic.
To construct plasmid pXI12-ZYIB-EINV4MUTRBS2CNEO, the neomycin resistance gene was isolated from plasmid
pBEST501 with PstI and SmaI and subcloned into the PstI and EcoO109I sites of the pUC18 vector. The resulting
construct was named pUC18-Neo. To get the final construct, the PmeI-AatII fragment of plasmid pXI12-ZYIB-
EINV4MUTRBS2C was replaced with the SmaI-AatII fragment of pUC18-Neo, containing the neomycin resistance
gene. Plasmid pXI12-ZYIB-EINV4MUTRBS2CCAT was obtained as follows: the chloramphenicol resistance gene of
pC194 was isolated by PCR using the primer pair cat3 and cat4. The fragment was digested with EcoRI and AatII and
subcloned into the EcoRI and AatII sites of pUC18. The resulting plasmid was named pUC18-CAT. The final vector was
obtained by replacing the PmeI-AatII fragment of pXI12-ZYIB-EINV4MUTRBS2C with the EcoRI-AatII fragment of
pUC18-CAT carrying the chloramphenicol resistance gene. Figure 23 summarizes the different steps to obtain the
aforementioned constructs. Both plasmids were transformed into B. subtilis strain 1012, and transformants resulting from a
Campbell-type integration were selected. Two strains, BS1012::SFCONEO1 and BS1012::SFCOCAT1, were chosen for
further amplification. Individual colonies of both strains were independently amplified by growing them in different
concentrations of antibiotics as described in the methods section. For the cat gene carrying strain, the chloramphenicol
concentrations were 60, 80, 120 and 150 µg/ml. For the neo gene carrying strain, the neomycin concentrations were
160 and 180 µg/ml. In both cases only strains with minor amplifications of the SFCO's were obtained. In daughter
strains generated from strain BS1012::SFCONEO1, the resistance to higher neomycin concentrations correlated with
the increase in the number of SFCO's in the chromosome and with higher levels of carotenoids produced by these cells.
A different result was obtained with daughter strains obtained from strain BS1012::SFCOCAT1. In these strains an
increase up to 150 µg chloramphenicol/ml resulted, as expected, in a higher number of SFCO copies in the
chromosome.

Example 7

Construction of crtW containing plasmids and use for carotenoid production

Polymerase chain reaction based gene synthesis. The nucleotide sequence of the artificial crtW gene, encoding
the β-carotene β-4-oxygenase of Alcaligenes strain PC-1, was obtained by back translating the amino acid sequence
outlined in [Misawa, 1995], using the BackTranslate program of the GCG Wisconsin Sequence Analysis Package,
Version 8.0 (Genetics Computer Group, Madison, WI, USA) and a codon frequency reference table of E. coli (supplied by
the BackTranslate program). The synthetic gene consisting of 726 nucleotides was constructed basically according to
the method described by [Ye, 1992]. The sequences of the 12 oligonucleotides (crtW1 - crtW12) required for the
synthesis are shown in Figure 25. Briefly, the long oligonucleotides were designed to have short overlaps of 15-20 bases,
serving as primers for the extension of the oligonucleotides. After four cycles a few copies of the full length gene should be
present, which are then amplified by the two terminal oligonucleotides crtW15 and crtW26. The sequences for these two
short oligonucleotides are for the forward primer crtW15 (5'-TATATCTAGAcat atofCCGGTCG^ CCGG-3') and for
the reverse primer crtW26 (5'-TATAQaattccacQtaTCA- AGCAGGACCACCGGlTT^ G-3'), where the sequences
matching the DNA templates are underlined. Small cap letters show the introduced restriction sites (NdeI for the
forward primer and EcoRI and PmlI for the reverse primer) for the later cloning into the pALTER-Ex2 expression vector.
Poiym'erase chain reaction. All twelve long oligonucleotides (crtWI -crtW12: 7 hM each)Vnd both terminal primers 
(crtW15 and crtW26; 0.1 mM each) were mixed and added to a PGR reaction mix containing Expand^ High Fidelity 
polymerase (Boehringer, Manriheirii) (3.5 units) and dNTP's (100 mM each). The PCR reaction was run'fbr 30 cycles 
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with the following profile: 94 ®C for 1 min, 50 for 2 min and 72 °C for 3 min. The PGR reaction was separated on a 
1% agarose gel, and the band of approx. 700 bp was excised and purified using the glass beads method (Geneclean 
Kit, Biol 01. Vista, CA, USA). The fragment was subsequentely cloned into the Smal site of plasmid pUC18. using the 
Sure-Clone Kit (Pharmacia, Uppsala, Sweden). The sequence of the resulting crtW synthetic gene was verified by 
5 sequencing with the Sequenase Kit Version 1 .0 (United States Biochemical, Cleveland, OH, USA). The crtW gene con- 
structed by this method was found to contain minor errors, which were subsequently corrected by site-directed muta- 
genesis. 
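
The back translation step described above can be illustrated with a minimal sketch. It is not the GCG BackTranslate program: the table below simply assigns one codon per amino acid, and the particular codon choices are an assumption made for illustration rather than the codon frequency reference table used in this example. The function name and the choice of TAA as stop codon are likewise hypothetical.

```python
# Illustrative one-codon-per-residue table. The codon choices are an
# assumption for this sketch, not the GCG E. coli frequency table.
ASSUMED_ECOLI_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein, codon_table=ASSUMED_ECOLI_CODON, stop_codon="TAA"):
    """Return a DNA sequence encoding `protein` (one-letter code) plus a stop codon."""
    try:
        codons = [codon_table[residue] for residue in protein.upper()]
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None
    return "".join(codons) + stop_codon


if __name__ == "__main__":
    # Conserved peptide of region I (HDAMHG), used here only as a short input.
    gene = back_translate("HDAMHG")
    print(gene, len(gene))
```

A real back translation would choose among synonymous codons according to their measured frequencies; the fixed table keeps the sketch deterministic and easy to check.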

Construction of plasmids. Plasmid pBIIKS(+)-CARVEG-E (see also Example 5) (Figure 26) contains the carotenoid
biosynthesis genes (crtE, crtB, crtY, crtI and crtZ) of the Gram (-) bacterium Flavobacterium sp. strain R1534 WT
(ATCC 21588) [Pasamontes, 1995 #732] cloned into a modified pBluescript II KS(+) vector (Stratagene, La Jolla, USA)
carrying site 1 of the B. subtilis veg promoter [LeGrice, 1986 #806]. This constitutive promoter has been shown to be
functional in E. coli. Transformants of E. coli strain TG1 carrying plasmid pBIIKS(+)-CARVEG-E synthesise zeaxanthin.
Plasmid pALTER-Ex2-crtW was constructed by cloning the NdeI - EcoRI restricted fragment of the synthetic crtW gene
into the corresponding sites of plasmid pALTER-Ex2 (Promega, Madison, WI). Plasmid pALTER-Ex2 is a low copy
plasmid with the p15a origin of replication, which allows it to be maintained with ColE1 vectors in the same host. Plasmid
pBIIKS-crtEBIYZW (Figure 26) was obtained by cloning the HindIII-PmlI fragment of pALTER-Ex2-crtW into the HindIII
and the blunt end made MluI site obtained by a fill in reaction with Klenow enzyme, as described elsewhere in
[Sambrook, 1989 #505]. Inactivation of the crtZ gene was done by deleting a 285 bp NsiI-NsiI fragment, followed by a fill in
reaction and religation, resulting in plasmid pBIIKS-crtEBIY[ΔZ]W. Plasmid pBIIKS-crtEBIY[ΔZW], carrying the
non-functional genes crtW and crtZ, was constructed by digesting the plasmid pBIIKS-crtEBIY[ΔZ]W with NdeI and HpaI,
and subsequent self religation of the plasmid after filling in the sites with Klenow enzyme. E. coli transformed with this
plasmid had a yellow-orange colour due to the accumulation of β-carotene. Plasmid pBIIKS-crtEBIYZ[ΔW] has a
truncated crtW gene obtained by deleting the NdeI - HpaI fragment in plasmid pBIIKS-crtEBIYZW as outlined above.
Plasmids pALTER-Ex2-crtEBIY[ΔZW] and pALTER-Ex2-crtEBIYZ[ΔW] were obtained by isolating the BamHI-XbaI
fragment from pBIIKS-crtEBIY[ΔZW] and pBIIKS-crtEBIYZ[ΔW], respectively, and cloning them into the BamHI and
XbaI sites of pALTER-Ex2. The plasmid pBIIKS-crtW was constructed by digesting pBIIKS-crtEBIYZW with NsiI and
SacI, and self-religating the plasmid after recessing the DNA overhangs with Klenow enzyme. Figure 27 compiles the
relevant inserts of all the plasmids used in this paper.

Carotenoid analysis. E. coli TG-1 transformants carrying the different plasmid constructs were grown for 20 hours
in Luria-Broth medium supplemented with antibiotics (ampicillin 100 µg/ml, tetracyclin 12.5 µg/ml) in shake flasks at
37 °C and 220 rpm. Carotenoids were extracted from the cells with acetone. The acetone was removed in vacuo and
the residue was redissolved in toluene. The coloured solutions were subjected to high-performance liquid
chromatography (HPLC) analysis, which was performed on a Hewlett-Packard series 1050 instrument. The carotenoids were
separated on a silica column Nucleosil Si-100, 200 x 4 mm, 3 µm. The solvent system included two solvents: hexane (A)
and hexane/THF, 1:1 (B). A linear gradient was applied running from 13 to 50 % (B) within 15 minutes. The flow rate
was 1.5 ml/min. Peaks were detected at 450 nm by a photo diode array detector. The individual carotenoid pigments
were identified by their absorption spectra and typical retention times as compared to reference samples of chemically
pure carotenoids, prepared by chemical synthesis and characterised by NMR, MS and UV spectra. HPLC analysis of
the pigments isolated from E. coli cells transformed with plasmid pBIIKS-crtEBIYZW, carrying besides the carotenoid
biosynthesis genes of Flavobacterium sp. strain R1534 also the crtW gene encoding the β-carotene ketolase of
Alcaligenes PC-1 [Misawa, 1995 #670], gave the following major peaks identified as: β-cryptoxanthin, astaxanthin,
adonixanthin and zeaxanthin, based on the retention times and on the comparison of the absorbance spectra to given
reference samples of chemically pure carotenoids. The relative amount (area percent) of the accumulated pigment in the
E. coli transformant carrying pBIIKS-crtEBIYZW is shown in Table 3 ["CRX": cryptoxanthin; "ASX": astaxanthin; "ADX":
adonixanthin; "ZXN": zeaxanthin; "ECH": echinenone; "HECH": 3-hydroxyechinenone; "CXN": canthaxanthin]. The Σ of
the peak areas of all identified carotenoids was defined as 100%. Numbers shown in Table 3 represent the average
value of four independent cultures for each transformant. In contrast to the aforementioned results, E. coli
transformants carrying the same genes but on two plasmids, namely pBIIKS-crtEBIYZ[ΔW] and pALTER-Ex2-crtW, showed a
drastic drop in adonixanthin and a complete lack of astaxanthin pigments (Table 3), whereas the relative amount of
zeaxanthin (%) had increased. Echinenone, hydroxyechinenone and canthaxanthin levels remained unchanged
compared to the transformants carrying all the crt genes on the same plasmid (pBIIKS-crtEBIYZW). Plasmid
pBIIKS-crtEBIYZ[ΔW] is a high copy plasmid carrying the functional genes crtE, crtB, crtY, crtI and crtZ of Flavobacterium sp. strain
R1534 and a truncated, non-functional version of the crtW gene, whereas the functional copy of the crtW gene is
located on the low copy plasmid pALTER-Ex2-crtW. To analyze the effect of overexpression of the crtW gene with
respect to the crtZ gene, E. coli cells were cotransformed with plasmid pBIIKS-crtW, carrying the crtW gene on the high
copy plasmid, and the low copy construct pALTER-Ex2-crtEBIYZ[ΔW], encoding the Flavobacterium crt
genes. Pigment analysis of these transformants by HPLC monitored the presence of β-carotene, cryptoxanthin,
astaxanthin, adonixanthin, zeaxanthin, 3-hydroxyechinenone and minute traces of echinenone and canthaxanthin (Table 3).
Transformants harbouring the crtW gene on the low copy plasmid pALTER-Ex2-crtW and the genes crtE, crtB, crtY
and crtI on the high copy plasmid pBIIKS-crtEBIY[ΔZW] expressed only minor amounts of canthaxanthin (5 %) but high
levels of echinenone (94%), whereas cells carrying the crtW gene on the high copy plasmid pBIIKS-crtW and the other
crt genes on the low copy construct pALTER-Ex2-crtEBIY[ΔZW] had 78.6% and 21.4 % of echinenone and
canthaxanthin, respectively (Table 3).
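
The area percent normalisation used for the pigment tables (sum of the peak areas of all identified carotenoids defined as 100%) can be sketched as follows. The function name and the peak areas in the usage block are hypothetical values in arbitrary detector units, chosen only to show the calculation; they are not measured data from this example.

```python
def area_percent(peak_areas):
    """Convert integrated HPLC peak areas into percent of the summed area.

    `peak_areas` maps a pigment label to its integrated area at 450 nm.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("the summed peak area must be positive")
    return {label: 100.0 * area / total for label, area in peak_areas.items()}


if __name__ == "__main__":
    # Hypothetical areas in arbitrary units, for illustration only.
    example = {"CRX": 12.0, "ASX": 20.0, "ADX": 440.0, "ZXN": 528.0}
    for label, percent in area_percent(example).items():
        print(f"{label}: {percent:.1f} %")
```

Averaging over independent cultures, as done for the tables, would be applied to the percentages obtained for each culture.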



Table 3

plasmids                                 CRX    ASX    ADX    ZXN    ECH    HECH   CXN
pBIIKS-crtEBIYZW                         1.1    2.0    44.2   52.4   <1     <1     <1
pBIIKS-crtEBIYZ[ΔW] + pALTER-Ex2-crtW    2.2           25.4   72.4   <1     <1     <1
pBIIKS-crtEBIY[ΔZ]W                                                  66.5          33.5
pBIIKS-crtEBIY[ΔZW] + pBIIKS-crtW                                    94

Example 8

Selective carotenoid production by using the crtW and crtZ genes of the Gram negative bacterium E-396

In this section we describe E. coli transformants which accumulate only one (canthaxanthin) or two main
carotenoids (astaxanthin, adonixanthin) and minor amounts of adonirubin, rather than the complex variety of carotenoids seen
in most carotenoid producing bacteria [Yokoyama et al., Biosci. Biotechnol. Biochem. 58:1842-1844 (1994)] and some
of the E. coli transformants shown in Table 3. The ability to construct strains producing only one carotenoid is a major
step towards a successful biotechnological carotenoid production process. This increase in the accumulation of
individual carotenoids, accompanied by a decrease of the intermediates, was obtained by replacing the crtZ of Flavobacterium
R1534 and/or the synthetic crtW gene (see example 5) by their homologous genes originating from the astaxanthin
producing Gram negative bacterium E-396 (FERM BP-4283) [Tsubokura et al., EP-application 0 635 576 A1]. Both genes,
crtWE396 and crtZE396, were isolated and used to construct new plasmids as outlined below.

Isolation of a putative fragment of the crtW gene of strain E-396 by the polymerase chain reaction. Based on
protein sequence comparison of the crtW enzymes of Agrobacterium aurantiacum, Alcaligenes PC-1 (WO95/18220)
[Misawa et al., J. Bacteriol. 177: 6575-6584 (1995)] and Haematococcus pluvialis [Kajiwara et al., Plant Mol. Biol. 29:343-352
(1995)][Lotan et al., FEBS Letters 364:125-128 (1995)], two regions named I and II, having high amino acid
conservation and located approx. 140 amino acids apart, were identified and chosen to design the degenerate PCR
primers shown below. The N-terminal peptide HDAMHG (region I) was used to design the two 17-mer degenerate primer
sequences crtW100 and crtW101:

crtW100: 5'-CA(C/T)GA(C/T)GC(A/C)ATGCA(C/T)GG-3'

crtW101: 5'-CA(C/T)GA(C/T)GC(G/T)ATGCA(C/T)GG-3'

The C-terminal peptide H(W/H)EHH(R/L) corresponding to region II was used to design the two 17-mer degenerate
primers with the antisense sequences crtW105 and crtW106:

crtW105: 5'-AG(G/A)TG(G/A)TG(T/C)TG(G/A)TG(G/A)TG-3'

crtW106: 5'-AG(G/A)TG(G/A)TG(T/C)TCCCA(G/A)TG-3'
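
The degeneracy of such primers can be made explicit by expanding the bracketed alternatives. The sketch below is an illustration only: the function name is hypothetical and the parser assumes the notation used above, in which each degenerate position is written as alternatives in parentheses separated by a slash.

```python
import itertools
import re


def expand_degenerate(primer):
    """List every concrete sequence encoded by a primer such as 'CA(C/T)GG'."""
    # Each token is either a bracketed set of alternatives or a run of fixed bases.
    tokens = re.findall(r"\(([^)]+)\)|([ACGT]+)", primer)
    choices = [alt.split("/") if alt else [fixed] for alt, fixed in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    crtW100 = "CA(C/T)GA(C/T)GC(A/C)ATGCA(C/T)GG"
    crtW105 = "AG(G/A)TG(G/A)TG(T/C)TG(G/A)TG(G/A)TG"
    for name, primer in (("crtW100", crtW100), ("crtW105", crtW105)):
        variants = expand_degenerate(primer)
        print(name, len(variants), "variants of length", len(variants[0]))
```

With four two-fold degenerate positions crtW100 corresponds to 16 distinct 17-mers, and crtW105 with five such positions to 32.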

Polymerase chain reaction. PCR was performed using the GeneAmp Kit (Perkin Elmer Cetus) according to the
manufacturer's instructions. The different PCR reactions contained combinations of the degenerate primers
(crtW100/crtW105 or crtW100/crtW106 or crtW101/crtW105 or crtW101/crtW106) at a final concentration of 50 pM
each, together with genomic DNA of the bacterium E-396 (200 ng) and 2.5 units of Taq polymerase. In total 35 cycles
of PCR were performed with the following cycle profile: 95 °C for 30 sec, 55 °C for 30 sec, 72 °C for 30 sec. PCR
reactions made with the primer combinations crtW100/crtW105 and crtW101/crtW105 gave PCR amplification
products of approx. 500 bp, which were in accordance with the expected fragment size. The 500 bp fragment,
JAPclone8, obtained in the PCR reaction using primers crtW101 and crtW105 was excised from a 1.5% agarose gel
and purified using the GENECLEAN Kit and subsequently cloned into the SmaI site of pUC18 using the Sure-Clone Kit,
according to the manufacturer's instructions. The resulting plasmid was named pUC18-JAPclone8 and the insert was
sequenced. Comparison of the determined sequence to the crtW gene of Agrobacterium aurantiacum (GenBank
accession no. D58420) published by Misawa et al. in 1995 (WO95/18220) showed 96% identity at the nucleotide
sequence level, indicating that both organisms might be closely related.

Isolation of the crt cluster of the strain E-396. Genomic DNA of E-396 was digested overnight with different
combinations of restriction enzymes and separated by agarose gel electrophoresis before transferring the resulting
fragments by Southern blotting onto a nitrocellulose membrane. The blot was hybridised with a 32P labelled 334 bp
fragment obtained by digesting the aforementioned PCR fragment JAPclone8 with BssHII and MluI. An approx. 9.4 kb
EcoRI/BamHI fragment hybridizing to the probe was identified as the most appropriate for cloning since it is long enough
to potentially carry the complete crt cluster. The fragment was isolated and cloned into the EcoRI and BamHI sites of
pBluescriptIIKS resulting in plasmid pJAPCL544 (Fig. 29). Based on the sequence of the PCR fragment JAPclone8,
two primers were synthesized to obtain more sequence information left and right hand of this fragment. Fig. 30 shows
the sequence obtained containing the crtWE396 (from nucleotide 40 to 768) and crtZE396 (from nucleotide 765 to 1253)
genes of the bacterium E-396. The nucleotide sequence of the crtWE396 gene is shown in Fig. 31 and the encoded
amino acid sequence in Fig. 32. The nucleotide sequence of the crtZE396 gene is shown in Fig. 33 and the
corresponding amino acid sequence in Fig. 34. Comparison of the crtWE396 gene of E-396 to the crtW gene of A. aurantiacum
showed 97 % identity at the nucleotide level and 99% identity at the amino acid level. For the crtZ gene the values were
98 % and 99 %, respectively.
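
The coordinates quoted above imply that the two open reading frames overlap by a few nucleotides. A minimal sketch of the arithmetic follows, assuming that both coordinate pairs are 1-based and inclusive and that both ranges include the stop codon; the function and constant names are hypothetical.

```python
def orf_length(first, last):
    """Length of a feature given 1-based inclusive coordinates."""
    return last - first + 1


def shared_positions(a, b):
    """Number of positions covered by both 1-based inclusive ranges."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


CRTW_E396 = (40, 768)    # coordinates as quoted in the text
CRTZ_E396 = (765, 1253)

if __name__ == "__main__":
    print("crtW length:", orf_length(*CRTW_E396))
    print("crtZ length:", orf_length(*CRTZ_E396))
    print("overlap:", shared_positions(CRTW_E396, CRTZ_E396))
    # A frame offset other than 0 means the two genes are read in different frames.
    print("frame offset:", (CRTZ_E396[0] - CRTW_E396[0]) % 3)
```

Under these assumptions crtW spans 729 nucleotides, which agrees with the length given for SEQ ID NO: 1, and crtZ spans 489 nucleotides, i.e. the 486 nucleotides of SEQ ID NO: 3 plus a stop codon. The four shared positions are consistent with the ATG of crtZ overlapping the TGA stop codon of crtW.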

Construction of plasmids: Both genes, crtWE396 and crtZE396, which are adjacent in the genome of E-396, were
isolated by PCR using primers crtW107 and crtW108 and the Expand™ High-Fidelity PCR system of Boehringer
Mannheim, according to the manufacturer's recommendations. To facilitate the subsequent cloning steps (see section
below) the primer crtW107 (5'-ATCATATGAGCGCACATGCCCTGCCCAAGGC-3') contains an artificial NdeI site
(underlined sequence) spanning the ATG start codon of the crtWE396 gene and the reverse primer crtW108
(5'-ATCTCGAGTCACGTGCGCTCCTGCGCCTCGGCC-3') has an XhoI site (underlined sequence) just downstream of the TGA stop
codon of the crtZE396 gene. The final PCR reaction mix had 10 pM of each primer, 2.5 µg genomic DNA of the
bacterium E-396 and 3.5 units of the TaqDNA/Pwo DNA polymerase mix. In total 35 cycles were performed with the following
cycle profile: 95 °C, 1 min; 60 °C, 1 min; 72 °C, 1 min 30 sec. The PCR product of approx. 1250 bp was isolated from the
1% agarose gel and purified using GENECLEAN before ligation into the SmaI site of pUC18 using the Sure-Clone Kit.
The resulting construct was named pUC18-E396crtWZPCR (Fig. 35). The functionality of both genes was tested as
follows. The crtWE396 and crtZE396 genes were isolated from plasmid pUC18-E396crtWZPCR with NdeI and XhoI and
cloned into the NdeI and SalI sites of plasmid pBIIKS-crtEBIYZW resulting in plasmid pBIIKS-crtEBIY[E396WZ] (Fig.
36). E. coli TG1 cells transformed with this plasmid produced astaxanthin, adonixanthin and adonirubin but no
zeaxanthin (Table 4).

Plasmid pBIIKS-crtEBIY[E396W]ΔZ has a truncated non-functional crtZ gene. Fig. 37 outlines the construction of
this plasmid. The PCR reaction was run as outlined elsewhere in the text using primers crtW113/crtW114 and 200 ng
of plasmid pUC18-JAPclone8 as template, using 20 cycles with the following protocol: 95 °C, 45 sec / 62 °C, 20 sec /
72 °C, 20 sec.

primer crtW113 (5'-ATATACATATGGTGTCCCCCTTGGTGCGGGTGC-3')

primer crtW114 (5'-TATGGATCCGACGCGTTCCCGGACCGCCACAATGC-3')

The resulting 150 bp fragment was digested with BamHI and NdeI and cloned into the corresponding sites of
pBIISK(+)-PCRRBScrtZ resulting in the construct pBIISK(+)-PCRRBScrtZ-2. The final plasmid carrying the genes crtE,
crtB, crtI, crtY of Flavobacterium, the crtWE396 gene of E-396 and a truncated non-functional crtZ gene of
Flavobacterium was obtained by isolating the MluI/NruI fragment (280 bp) of pBIISK(+)-PCRRBScrtZ-2 and cloning it
into the MluI/PmlI sites of plasmid pBIIKS-crtEBIY[E396WZ]. E. coli cells transformed with this plasmid produced 100%
canthaxanthin (Table 4; "CRX": cryptoxanthin; "ASX": astaxanthin; "ADX": adonixanthin; "ZXN": zeaxanthin; "ECH":
echinenone; "HECH": 3-hydroxyechinenone; "CXN": canthaxanthin; "BCA": β-carotene; "ADR": adonirubin; numbers
indicate the % of the individual carotenoid of the total carotenoids produced in the cell).



Table 4

plasmid                     CRX    ASX    ADX    ZXN    ECH    HECH   CXN    BCA    ADR
pBIIKS-crtEBIYZW            1.1    2.0    44.2   52.4   <1     <1     <1
pBIIKS-crtEBIY[E396WZ]             74.4   19.8                                      5.8
pBIIKS-crtEBIY[E396W]ΔZ                                               100

The results of E. coli transformants carrying pBIIKS-crtEBIYZW (see example 7) are also shown in Table 4 to indicate
the dramatic effect of the new genes crtWE396 and crtZE396 on the carotenoids produced in these new transformants.

Example 9 

Cloning of the remaining crt genes of the Gram negative bacterium E-396.

TG1 E. coli transformants carrying the pJAPCL544 plasmid did not produce detectable quantities of carotenoids
(results not shown). Sequence analysis and comparison of the 3' end (BamHI site) of the insert of plasmid pJAPCL544 to
the crt cluster of Flavobacterium R1534 showed that only part of the C-terminus of the crtE gene was present. This
result explained the lack of carotenoid production in the aforementioned transformants. To isolate the missing
N-terminal part of the gene, genomic DNA of E-396 was digested by 6 restriction enzymes in different combinations: EcoRI,
BamHI, PstI, SacI, SphI and XbaI and transferred by the Southern blot technique to nitrocellulose. Hybridization of this
membrane with the 32P radio-labelled probe (a 463 bp PstI-BamHI fragment originating from the 3' end of the insert of
pJAPCL544 (Fig. 29)) highlighted a ~1300 bp-long PstI-PstI fragment. This fragment was isolated and cloned into the
PstI site of pBSIIKS(+) resulting in plasmid pBSIIKS-#1296. The sequence of the insert is shown in Fig. 38 (small cap
letters refer to new sequence obtained; capital letters show the sequence also present in the 3' end of the insert of plasmid
pJAPCL544). The complete crtE gene has therefore a length of 882 bp (see Fig. 39) and encodes a GGPP synthase
of 294 amino acids (Fig. 40). The crtE enzyme has 38 % identity with the crtE amino acid sequence of Erwinia herbicola
and 66 % with Flavobacterium R1534 WT.

Construction of plasmids. To have a plasmid carrying the complete crt cluster of E-396, the 4.7 kb MluI/BamHI
fragment encoding the genes crtW, crtZ, crtY, crtI and crtB was isolated from pJAPCL544 and cloned into the
MluI/BamHI sites of pUC18-E396crtWZPCR (see example 8). The new construct was named pE396CARcrtW-B (Fig.
41) and lacked the N-terminus of the crtE gene. The missing N-terminal part of the crtE gene was then introduced by
ligation of the aforementioned PstI fragment of pBSIIKS-#1296 between the PstI sites of pE396CARcrtW-B. The
resulting plasmid was named pE396CARcrtW-E (Fig. 41). The carotenoid distribution of the E. coli transformants carrying
the aforementioned plasmid was: adonixanthin (65%), astaxanthin (8%) and zeaxanthin (3%). The % indicated reflects the
proportion of the total amount of carotenoid produced in the cell.

Example 10

Astaxanthin and adonixanthin production in Flavobacterium R1534

Among bacteria Flavobacterium may represent the best source for the development of a fermentative production
process for 3R,3'R zeaxanthin. Derivatives of Flavobacterium sp. strain R1534, obtained by classical mutagenesis,
have attracted in the past two decades wide interest for the development of a large scale fermentative production of
zeaxanthin, although with little success. Cloning of the carotenoid biosynthesis genes of this organism, as outlined in
example 2, may allow replacement of the classical mutagenesis approach by a more rational one, using molecular tools
to amplify the copy number of relevant genes, deregulate their expression and eliminate bottlenecks in the carotenoid
biosynthesis pathway. Furthermore, the introduction of additional heterologous genes (e.g. crtW) will result in the
production of carotenoids normally not synthesised by this bacterium (astaxanthin, adonirubin, adonixanthin,
canthaxanthin, echinenone). The construction of such recombinant Flavobacterium R1534 strains producing astaxanthin and
adonixanthin will be outlined below.

Gene transfer into Flavobacterium sp.

Plasmid transfer by conjugative mobilization. For the conjugational crosses we constructed plasmid pRSF1010-Ampr,
a derivative of the small (8.9 kb) broad host range plasmid RSF1010 (IncQ incompatibility group) [Guerry et al.,
J. Bacteriol. 117:619-630 (1974)] and used E. coli S17-1 as the mobilizing strain [Priefer et al., J. Bacteriol. 163:324-330
(1985)]. In general any of the IncQ plasmids (e.g. RSF1010, R300B, R1162) may be mobilized into rifampicin
resistant Flavobacterium if the transfer functions are provided by plasmids of the IncP1 group (e.g. R1, R751).

Rifampicin resistant (Rifr) Flavobacterium R1534 cells were obtained by selection on 100 µg rifampicin/ml. One
resistant colony was picked and a stock culture was made. The conjugation protocol was as follows:

Day 1:

- grow 3 ml culture of Flavobacterium R1534 Rifr for 24 hours at 30 °C in Flavobacter medium (F-medium) (see
  example 1)
- grow 3 ml mobilizing E. coli strain carrying the mobilizable plasmid O/N at 37 °C in LB medium (e.g. E. coli S17-1
  carrying pRSF1010-Ampr or E. coli TG-1 cells carrying R751 and pRSF1010-Ampr)

Day 2:

- pellet 1 ml of the Flavobacterium R1534 Rifr cells and resuspend in 1 ml of fresh F-medium.
- pellet 1 ml of E. coli cells (see above) and resuspend in 1 ml of LB medium.
- donor and recipient cells are then mixed in a ratio of 1:1 and 1:10 in an Eppendorf tube and 30 µl are then applied
  onto a nitrocellulose filter plated on agar plates containing F-medium and incubated O/N at 30 °C.

Day 3:

- the conjugational mixtures were washed off with F-medium and plated on F-medium containing 100 µg rifampicin
  and 100 µg ampicillin/ml for selection of transconjugants and inhibition of the donor cells.

Day 6-8:

- Arising clones are plated once more on F-medium containing 100 µg Rif and 100 µg Amp/ml before analysis.

Plasmid transfer by electroporation. The protocol for the electroporation is as follows:

1. add 10 ml of O/N culture of Flavobacterium sp. R1534 into 500 ml F-medium and incubate at 30 °C until
OD600 = 0.8-0.1

2. harvest cells by centrifugation at 4000g at 4 °C for 10 min.

3. wash cells in equal volume of ice-cold deionized water (2 times)

4. resuspend bacterial pellet in 1 ml ice-cold deionized water

5. take 50 µl of cells for electroporation with 0.1 µg of plasmid DNA

6. electroporation was done using field strengths between 15 and 25 kV/cm and 1-3 ms.

7. after electroporation cells were immediately diluted in 1 ml of F-medium and incubated for 2 hours at 30 °C at 180
rpm before plating on F-medium plates containing the respective selective antibiotic.

Plasmid constructions: Plasmid pRSF1010-Ampr was obtained by cloning the Ampr gene of pBR322 between the
EcoRI/NotI sites of RSF1010. The Ampr gene originates from pBR322 and was isolated by PCR using primers AmpR1
and AmpR2 as shown in Fig. 42.

AmpR1:

5'-TATATCGGCCGACTAGTAAGCTTCAAAAAGGATCTTCACCTAG-3' the underlined sequence contains the
introduced restriction sites for EagI, SpeI and HindIII to facilitate subsequent constructions.

AmpR2:

5'-ATATGAATTCAATAATATTGAAAAAGGAAG-3' the underlined sequence corresponds to an introduced EcoRI
restriction site to facilitate cloning into RSF1010 (see Fig. 42).

The PCR reaction mix had 10 pM of each primer (AmpR1/AmpR2), 0.5 µg plasmid pBR322 and 3.5 units of the
TaqDNA/Pwo DNA polymerase mix. In total 35 amplification cycles were made with the profile: 95 °C, 45 sec; 59 °C, 45
sec; 72 °C, 1 min. The PCR product of approx. 950 bp was extracted once with phenol/chloroform and precipitated with 0.3
M NaAcetate and 2 vol. ethanol. The pellet was resuspended in H2O and digested with EcoRI and EagI O/N. The
digestion was separated by electrophoresis and the fragment isolated from the 1% agarose gel and purified using
GENECLEAN before ligation into the EcoRI and NotI sites of RSF1010. The resulting plasmid was named pRSF1010-Ampr
(Fig. 42).

Plasmid RSF1010-Ampr-crt1 was obtained by isolating the HindIII/NotI fragment of pBIIKS-crtEBIY[E396WZ] and
cloning it between the HindIII/EagI sites of RSF1010-Ampr (Fig. 43). The resulting plasmid RSF1010-Ampr-crt1 carries
the crtWE396, crtZE396 and crtY genes and the N-terminus of the crtI gene (non-functional). Plasmid RSF1010-Ampr-crt2,
carrying a complete crt cluster composed of the genes crtWE396 and crtZE396 of E-396 and the crtY, crtI, crtB and crtE of
Flavobacterium R1534, was obtained by isolating the large HindIII/XbaI fragment of pBIIKS-crtEBIY[E396WZ] and
cloning it into the SpeI/HindIII sites of RSF1010-Ampr (Fig. 43).

Flavobacterium R1534 transformants carrying either plasmid RSF1010-Ampr, plasmid RSF1010-Ampr-crt1 or
plasmid RSF1010-Ampr-crt2 were obtained by conjugation as outlined above using E. coli S17-1 as mobilizing strain.

Comparison of the carotenoid production of two Flavobacterium transformants. Overnight cultures of the individual
transformants were diluted into 20 ml fresh F-medium to have a final starting OD600 of 0.4. Cells were harvested after
growing for 48 hours at 30 °C and carotenoid contents were analysed as outlined in example 7. Table 5 shows the result
of the three control cultures Flavobacterium [R1534 WT], [R1534 WT RifR] (rifampicin resistant) and [R1534 WT RifR
RSF1010-AmpR] (carries the RSF1010-AmpR plasmid) and the two transformants [R1534 WT RSF1010-AmpR-crt1]
and [R1534 WT RSF1010-AmpR-crt2]. Both latter transformants are able to synthesise astaxanthin and adonixanthin
but little zeaxanthin. Most interesting is the [R1534 WT RSF1010-AmpR-crt2] Flavobacterium transformant, which
produces approx. 4 times more carotenoids than the R1534 WT. This increase in total carotenoid production is most likely
due to the increase of the number of carotenoid biosynthesis clusters present in these cells (e.g. corresponds to the total
copy number of plasmids in the cell).

Table 5

Transformant                      carotenoids % of total dry weight    total carotenoid content in % of dry weight

R1534 WT                          0.039% β-Carotene                    0.06%
                                  0.001% β-Cryptoxanthin
                                  0.018% Zeaxanthin

R1534 Rifr                        0.036% β-Carotene                    0.06%
                                  0.002% β-Cryptoxanthin
                                  0.022% Zeaxanthin

R1534 Rifr [RSF1010-Ampr]         0.021% β-Carotene                    0.065%
                                  0.002% β-Cryptoxanthin
                                  0.032% Zeaxanthin

R1534 Rifr [RSF1010-Ampr-crt1]    0.022% Astaxanthin                   0.1%
                                  0.075% Adonixanthin
                                  0.004% Zeaxanthin

R1534 Rifr [RSF1010-Ampr-crt2]    0.132% β-Carotene                    0.235%
                                  0.006% Echinenone
                                  0.004% Hydroxyechinenone
                                  0.003% β-Cryptoxanthin
                                  0.044% Astaxanthin
                                  0.039% Adonixanthin
                                  0.007% Zeaxanthin
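
The "approx. 4 times" comparison drawn from Table 5 can be reproduced from the tabulated totals. The sketch below only restates that arithmetic; the variable names are hypothetical and the values are copied from Table 5 (percent of dry weight).

```python
# Total carotenoid content in % of dry weight, as listed in Table 5.
TOTAL_WT = 0.06
TOTAL_CRT2 = 0.235

# Individual carotenoids of the RSF1010-Ampr-crt2 transformant (Table 5).
CRT2_COMPONENTS = {
    "beta-carotene": 0.132,
    "echinenone": 0.006,
    "hydroxyechinenone": 0.004,
    "beta-cryptoxanthin": 0.003,
    "astaxanthin": 0.044,
    "adonixanthin": 0.039,
    "zeaxanthin": 0.007,
}


def fold_increase(sample, reference):
    """Ratio of the total carotenoid content of a sample to that of a reference."""
    if reference <= 0:
        raise ValueError("reference content must be positive")
    return sample / reference


if __name__ == "__main__":
    print(f"sum of crt2 components: {sum(CRT2_COMPONENTS.values()):.3f} %")
    print(f"fold increase over WT: {fold_increase(TOTAL_CRT2, TOTAL_WT):.1f}")
```

The individual values of the crt2 transformant add up to the listed total of 0.235%, and the ratio to the wild type total is about 3.9.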
SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: F.HOFFMANN-LA ROCHE AG
(B) STREET: GRENZACHERSTRASSE 124
(C) CITY: BASLE
(D) STATE: BS
(E) COUNTRY: SWITZERLAND
(F) POSTAL CODE (ZIP): CH - 4002
(G) TELEPHONE: 061 - 688 2505
(H) TELEFAX: 061 688 1395
(I) TELEX: 962292/965542 hlr ch

(ii) TITLE OF INVENTION: Improved fermentative carotenoid production

(iii) NUMBER OF SEQUENCES: 17

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(v) CURRENT APPLICATION DATA:
APPLICATION NUMBER: EP 97120324.5

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 729 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGAGCGCAC ATGCCCTGCC CAAGGCAGAT CTGACCGCCA CCAGTTTGAT CGTCTCGGGC 60
GGCATCATCG CCGCGTGGCT GGCCCTGCAT GTGCATGCGC TGTGGTTTCT GGACGCGGCG 120
GCGCATCCCA TCCTGGCGGT CGCGAATTTC CTGGGGCTGA CCTGGCTGTC GGTCGGTCTG 180
TTCATCATCG CGCATGACGC GATGCATGGG TCGGTCGTGC CGGGGCGCCC GCGCGCCAAT 240
GCGGCGATGG GCCAGCTTGT CCTGTGGCTG TATGCCGGAT TTTCCTGGCG CAAGATGATC 300
GTCAAGCACA TGGCCCATCA TCGCCATGCC GGAACCGACG ACGACCCAGA TTTCGACCAT 360
GGCGGCCCGG TCCGCTGGTA CGCCCGCTTC ATCGGCACCT ATTTCGGCTG GCGCGAGGGG 420
CTGCTGCTGC CCGTCATCGT GACGGTCTAT GCGCTGATGT TGGGGGATCG CTGGATGTAC 480
GTGGTCTTCT GGCCGTTGCC GTCGATCCTG GCGTCGATCC AGCTGTTCGT GTTCGGCATC 540
TGGCTGCCGC ACCGCCCCGG CCACGACGCG TTCCCGGACC GCCACAATGC GCGGTCGTCG 600
CGGATCAGCG ACCCCGTGTC GCTGCTGACC TGCTTTCACT TTGGCGGTTA TCATCACGAA 660
CACCACCTGC ACCCGACGGT GCCTTGGTGG CGCCTGCCCA GCACCCGCAC CAAGGGGGAC 720
ACCGCATGA 729
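
The correspondence between this nucleotide sequence and the amino acid sequence of SEQ ID NO: 2 can be spot-checked by translating it with the standard genetic code. The following sketch is illustrative only: it translates the first 60 and the last 9 nucleotides copied from the listing above, and the function and constant names are hypothetical.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 1 (one-letter code), stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


FIRST_60_NT = "ATGAGCGCAC ATGCCCTGCC CAAGGCAGAT CTGACCGCCA CCAGTTTGAT CGTCTCGGGC"
LAST_9_NT = "ACCGCATGA"

if __name__ == "__main__":
    print(translate(FIRST_60_NT))  # residues 1-20
    print(translate(LAST_9_NT))    # last two residues, followed by the TGA stop
```

The 729 nucleotides comprise 242 sense codons plus the stop codon, in agreement with the length of 242 amino acids given for SEQ ID NO: 2.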

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 242 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Ala His Ala Leu Pro Lys Ala Asp Leu Thr Ala Thr Ser Leu
1               5                   10                  15

Ile Val Ser Gly Gly Ile Ile Ala Ala Trp Leu Ala Leu His Val His
            20                  25                  30

Ala Leu Trp Phe Leu Asp Ala Ala Ala His Pro Ile Leu Ala Val Ala
        35                  40                  45

Asn Phe Leu Gly Leu Thr Trp Leu Ser Val Gly Leu Phe Ile Ile Ala
    50                  55                  60

His Asp Ala Met His Gly Ser Val Val Pro Gly Arg Pro Arg Ala Asn
65                  70                  75                  80

Ala Ala Met Gly Gln Leu Val Leu Trp Leu Tyr Ala Gly Phe Ser Trp
                85                  90                  95

Arg Lys Met Ile Val Lys His Met Ala His His Arg His Ala Gly Thr
            100                 105                 110

Asp Asp Asp Pro Asp Phe Asp His Gly Gly Pro Val Arg Trp Tyr Ala
        115                 120                 125

Arg Phe Ile Gly Thr Tyr Phe Gly Trp Arg Glu Gly Leu Leu Leu Pro
    130                 135                 140

Val Ile Val Thr Val Tyr Ala Leu Met Leu Gly Asp Arg Trp Met Tyr
145                 150                 155                 160

Val Val Phe Trp Pro Leu Pro Ser Ile Leu Ala Ser Ile Gln Leu Phe
                165                 170                 175

Val Phe Gly Ile Trp Leu Pro His Arg Pro Gly His Asp Ala Phe Pro
            180                 185                 190

Asp Arg His Asn Ala Arg Ser Ser Arg Ile Ser Asp Pro Val Ser Leu
        195                 200                 205

Leu Thr Cys Phe His Phe Gly Gly Tyr His His Glu His His Leu His
    210                 215                 220

Pro Thr Val Pro Trp Trp Arg Leu Pro Ser Thr Arg Thr Lys Gly Asp
225                 230                 235                 240

Thr Ala

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 486 base pairs ' 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATGACCAATT TCCTGATCGT CGTCGCCACC GTGCTGGTGA TGGAGCTGAC GGCCTATTCC 60 

GTCCACCGCT GGATCATGCA CGGCCCCTTG GGCTGGGGCT GGCACAAGTC CCACCACGAG 120 

GAACACGACC ACGCGCTGGA AAAGAACGAC CTGTACGGCC TGGTCTTTGC GGTGATCGCC 180 

ACGGTGCTGT TCACGGTGGG CTGGATCTGG GCACCGGTCC TGTGGTGGAT CGCCTTGGGC 240 

ATGACCGTCT ACGGGCTGAT CTATTTCGTC CTGCATGACG GGCTGGTGCA TCAGCGCTGG 300 

CCGTTCCGCT ATATCCCTCG CAAGGGCTAT GCCAGACGCC TGTATCAGGC CCACCGCCTG 360

CACCACGCGG TCGAGGGGCG CGACCATTGC GTCAGCTTCG GCTTCATCTA TGCGCCGCCG 420

GTCGACAAGC TGAAGCAGGA CCTGAAGACG TCGGGCGTGC TGCGGGCCGA GGCGCAGGAG 480

CGCACG 486 
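
A nucleotide listing of this kind can be translated and compared against the protein listing that accompanies it. The following is a minimal Python sketch, assuming the standard genetic code and a reading frame that starts at the first base; the helper names `clean` and `translate` are illustrative and not part of the specification.

```python
import re

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(listing: str) -> str:
    """Drop block spacing and running position counts from listing text."""
    return re.sub(r"[^ACGTacgt]", "", listing).upper()


def translate(dna: str) -> str:
    """Translate all complete codons in frame 1 into one-letter residues."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First line of SEQ ID NO: 3 as printed in the listing.
first_line = "ATGACCAATT TCCTGATCGT CGTCGCCACC GTGCTGGTGA TGGAGCTGAC GGCCTATTCC 60"
print(translate(clean(first_line)))  # MTNFLIVVATVLVMELTAYS
```

The printed residues correspond to Met Thr Asn Phe Leu Ile Val Val Ala Thr Val Leu Val Met Glu Leu Thr Ala Tyr Ser, the first twenty residues of SEQ ID NO: 4.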
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 162 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Thr Asn Phe Leu Ile Val Val Ala Thr Val Leu Val Met Glu Leu
1 5 10 15

Thr Ala Tyr Ser Val His Arg Trp Ile Met His Gly Pro Leu Gly Trp
20 25 30

Gly Trp His Lys Ser His His Glu Glu His Asp His Ala Leu Glu Lys
35 40 45

Asn Asp Leu Tyr Gly Leu Val Phe Ala Val Ile Ala Thr Val Leu Phe
50 55 60

Thr Val Gly Trp Ile Trp Ala Pro Val Leu Trp Trp Ile Ala Leu Gly
65 70 75 80

Met Thr Val Tyr Gly Leu Ile Tyr Phe Val Leu His Asp Gly Leu Val
85 90 95

His Gln Arg Trp Pro Phe Arg Tyr Ile Pro Arg Lys Gly Tyr Ala Arg
100 105 110

Arg Leu Tyr Gln Ala His Arg Leu His His Ala Val Glu Gly Arg Asp
115 120 125

His Cys Val Ser Phe Gly Phe Ile Tyr Ala Pro Pro Val Asp Lys Leu
130 135 140

Lys Gln Asp Leu Lys Thr Ser Gly Val Leu Arg Ala Glu Ala Gln Glu
145 150 155 160

Arg Thr
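
The figures give these proteins in one-letter code, whereas the listing uses three-letter code. A small Python sketch for converting one into the other follows, assuming the standard IUPAC abbreviations for the twenty common amino acids; the function name `to_one_letter` is illustrative.

```python
# Standard three-letter to one-letter amino acid abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing: str) -> str:
    """Convert whitespace-separated three-letter residues, skipping position numbers."""
    residues = [tok for tok in listing.split() if not tok.isdigit()]
    return "".join(THREE_TO_ONE[tok] for tok in residues)


print(to_one_letter("Met Thr Asn Phe Leu Ile Val Val Ala Thr Val Leu Val Met Glu Leu"))
# MTNFLIVVATVLVMEL
```

Because the lookup raises `KeyError` on any token outside the table, a misread residue name in a scanned listing is reported rather than silently converted.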



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 882 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATGAGACGAG ACGTCAACCC GATCCACGCC ACCCTTCTGC AGACCAGACT TGAGGAGATC 60 

GCCCAGGGAT TCGGTGCCGT GTCGCAGCCG CTCGGCCCGG CCATGAGCCA TGGCGCGCTG 120 

TCGTCGGGCA AGCGTTTCCG CGGCATGCTG ATGCTGCTTG CGGCAGAAGC CTCGGGCGGG 180 

GTCTGCGACA CGATCGTCGA CGCCGCCTGC GCGGTCGAGA TGGTGCATGC CGCATCGCTG 240

ATCTTCGACG ACCTGCCCTG CATGGACGAT GCCGGGCTGC GCCGCGGCCA GCCCGCGACC 300 

CATGTGGCGC ATGGCGAAAG CCGCGCCGTG CTAGGCGGCA TCGCCCTGAT CACCGAGGCG 360

ATGGCCCTGC TGGCCGGTGC GCGCGGCGCG TCGGGCACGG TGCGGGCGCA GCTGGTGCGG 420

ATCCTGTCGC GGTCCCTGGG GCCGCAGGGC CTGTGCGCCG GCCAGGACCT GGACCTGCAC 480 

GCGGCCAAGA ACGGCGCGGG GGTCGAACAG GAACAGGACC TGAAGACCGG CGTGCTGTTC 540 

ATCGCCGGGC TGGAGATGCT GGCCGTGATC AAGGAGTTCG ACGCCGAGGA GCAGACTCAG 600

ATGATCGACT TTGGCCGTCA GCTGGGCCGG GTGTTCCAGT CCTATGACGA CCTGCTGGAC 660 

GTTGTGGGCG ACCAGGCGGC GCTTGGCAAG GATACCGGTC GCGATGCGGC GGCCCCCGGC 720

CCGCGGCGCG GCCTTCTGGC CGTGTCAGAC CTGCAGAACG TGTCCCGTCA CTATGAGGCC 780

AGCCGCGCCC AGCTGGACGC GATGCTGCGC AGCAAGCGCC TTCAGGCTCC GGAAATCGCG 840 

GCCCTGCTGG AACGGGTTCT GCCCTACGCC GCGCGCGCCT AG 882 
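
The stated length of a coding sequence and that of its translation can be cross-checked with simple arithmetic: each residue takes three bases, and a terminal stop codon takes three more. A short Python sketch follows; the function name `expected_residues` and the `has_stop` flag are assumptions made for illustration.

```python
def expected_residues(n_bases: int, has_stop: bool) -> int:
    """Number of residues encoded by an in-frame coding sequence of n_bases."""
    if n_bases % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    codons = n_bases // 3
    # A terminal stop codon occupies one codon but encodes no residue.
    return codons - 1 if has_stop else codons


# SEQ ID NO: 5 is listed with 882 bases and ends in the stop codon TAG.
print(expected_residues(882, has_stop=True))  # 293
```

The result agrees with the 293 amino acids stated for SEQ ID NO: 6.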


(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 293 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Arg Arg Asp Val Asn Pro Ile His Ala Thr Leu Leu Gln Thr Arg
1 5 10 15

Leu Glu Glu Ile Ala Gln Gly Phe Gly Ala Val Ser Gln Pro Leu Gly
20 25 30

Pro Ala Met Ser His Gly Ala Leu Ser Ser Gly Lys Arg Phe Arg Gly
35 40 45

Met Leu Met Leu Leu Ala Ala Glu Ala Ser Gly Gly Val Cys Asp Thr
50 55 60

Ile Val Asp Ala Ala Cys Ala Val Glu Met Val His Ala Ala Ser Leu
65 70 75 80

Ile Phe Asp Asp Leu Pro Cys Met Asp Asp Ala Gly Leu Arg Arg Gly
85 90 95

Gln Pro Ala Thr His Val Ala His Gly Glu Ser Arg Ala Val Leu Gly
100 105 110

Gly Ile Ala Leu Ile Thr Glu Ala Met Ala Leu Leu Ala Gly Ala Arg
115 120 125

Gly Ala Ser Gly Thr Val Arg Ala Gln Leu Val Arg Ile Leu Ser Arg
130 135 140

Ser Leu Gly Pro Gln Gly Leu Cys Ala Gly Gln Asp Leu Asp Leu His
145 150 155 160

Ala Ala Lys Asn Gly Ala Gly Val Glu Gln Glu Gln Asp Leu Lys Thr
165 170 175

Gly Val Leu Phe Ile Ala Gly Leu Glu Met Leu Ala Val Ile Lys Glu
180 185 190

Phe Asp Ala Glu Glu Gln Thr Gln Met Ile Asp Phe Gly Arg Gln Leu
195 200 205

Gly Arg Val Phe Gln Ser Tyr Asp Asp Leu Leu Asp Val Val Gly Asp
210 215 220

Gln Ala Ala Leu Gly Lys Asp Thr Gly Arg Asp Ala Ala Ala Pro Gly
225 230 235 240

Pro Arg Arg Gly Leu Leu Ala Val Ser Asp Leu Gln Asn Val Ser Arg
245 250 255

His Tyr Glu Ala Ser Arg Ala Gln Leu Asp Ala Met Leu Arg Ser Lys
260 265 270

Arg Leu Gln Ala Pro Glu Ile Ala Ala Leu Leu Glu Arg Val Leu Pro
275 280 285

Tyr Ala Ala Arg Ala
290

(2) INFORMATION FOR SEQ ID NO: 7: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 295 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Thr Pro Lys Gln Gln Phe Pro Leu Arg Asp Leu Val Glu Ile Arg
1 5 10 15

Leu Ala Gln Ile Ser Gly Gln Phe Gly Val Val Ser Ala Pro Leu Gly
20 25 30

Ala Ala Met Ser Asp Ala Ala Leu Ser Pro Gly Lys Arg Phe Arg Ala
35 40 45

Val Leu Met Leu Met Val Ala Glu Ser Ser Gly Gly Val Cys Asp Ala
50 55 60

Met Val Asp Ala Ala Cys Ala Val Glu Met Val His Ala Ala Ser Leu
65 70 75 80

Ile Phe Asp Asp Met Pro Cys Met Asp Asp Ala Arg Thr Arg Arg Gly
85 90 95

Gln Pro Ala Thr His Val Ala His Gly Glu Gly Arg Ala Val Leu Ala
100 105 110

Gly Ile Ala Leu Ile Thr Glu Ala Met Arg Ile Leu Gly Glu Ala Arg
115 120 125

Gly Ala Thr Pro Asp Gln Arg Ala Arg Leu Val Ala Ser Met Ser Arg
130 135 140

Ala Met Gly Pro Val Gly Leu Cys Ala Gly Gln Asp Leu Asp Leu His
145 150 155 160

Ala Pro Lys Asp Ala Ala Gly Ile Glu Arg Glu Gln Asp Leu Lys Thr
165 170 175

Gly Val Leu Phe Val Ala Gly Leu Glu Met Leu Ser Ile Ile Lys Gly
180 185 190

Leu Asp Lys Ala Glu Thr Glu Gln Leu Met Ala Phe Gly Arg Gln Leu
195 200 205

Gly Arg Val Phe Gln Ser Tyr Asp Asp Leu Leu Asp Val Ile Gly Asp
210 215 220

Lys Ala Ser Thr Gly Lys Asp Thr Ala Arg Asp Thr Ala Ala Pro Gly
225 230 235 240

Pro Lys Gly Gly Leu Met Ala Val Gly Gln Met Gly Asp Val Ala Gln
245 250 255

His Tyr Arg Ala Ser Arg Ala Gln Leu Asp Glu Leu Met Arg Thr Arg
260 265 270

Leu Phe Arg Gly Gly Gln Ile Ala Asp Leu Leu Ala Arg Val Leu Pro
275 280 285

His Asp Ile Arg Arg Ser Ala
290 295

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 888 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

ATGACGCCCA AGCAGCAATT CCCCCTACGC GATCTGGTCG AGATCAGGCT GGCGCAGATC 60 

TCGGGCCAGT TCGGCGTGGT CTCGGCCCCG CTCGGCGCGG CCATGAGCGA TGCCGCCCTG 120 

TCCCCCGGCA JUVCGCTTTCG CGCCGTGCTG ATGCTGATGG TCGCCGAAAG CTCGGGCGGG 180 

GTCTGCGATG CGATGGTCGA TGCCGCCTGC GCGGTCGAGA TGGTCCATGC CGCATCGCTG 240 

ATCTTCGACG ACATGCCCTG CATGGACGAT GCCAGGACCC GTCGCGGTCA GCCCGCCACC 300

CATGTCGCCC ATGGCGAGGG GCGCGCGGTG CTTGCGGGCA TCGCCCTGAT CACCGAGGCC 360

ATGCGGATTT TGGGCGAGGC GCGCGGCGCG ACGCCGGATC AGCGCGCAAG GCTGGTCGCA 420 

TCCATGTCGC GCGCGATGGG ACCGGTGGGG CTGTGCGCAG GGCAGGATCT GGACCTGCAC 480 

GCCCCCAAGG ACGCCGCCGG GATCGAACGT GAACAGGACC TCAAGACCGG CGTGCTGTTC 540 

GTCGCGGGCC TCGAGATGCT GTCCATTATT AAGGGTCTGG ACAAGGCCGA GACCGAGCAG 600 

CTCATGGCCT TCGGGCGTCA GCTTGGTCGG GTCTTCCAGT CCTATGACGA CCTGCTGGAC 660 

GTGATCGGCG ACAAGGCCAG CACCGGCAAG GATACGGCGC GCGACACCGC CGCCCCCGGC 720 

GGAAAGGGGG- GGGTGArTGCC G<35!GGGAGAG- ATGGGGGACG. TGGCGCACCA. TTACCCCCCC 7.SSX 

AGCCGCGCGC AACTGGACGA GCTGATGCGC ACCCGGCTGT TCCGCGGGGG GCAGATCGCG 840 

GACCTGCTGG CCCGCGTGCT GCCGCATGAC ATCCGCCGCA GCGCCTAG 888 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 303 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Thr Asp Leu Thr Ala Thr Ser Glu Ala Ala Ile Ala Gln Gly Ser
1 5 10 15

Gln Ser Phe Ala Gln Ala Ala Lys Leu Met Pro Pro Gly Ile Arg Glu
20 25 30

Asp Thr Val Met Leu Tyr Ala Trp Cys Arg His Ala Asp Asp Val Ile
35 40 45

Asp Gly Gln Val Met Gly Ser Ala Pro Glu Ala Gly Gly Asp Pro Gln
50 55 60

Ala Arg Leu Gly Ala Leu Arg Ala Asp Thr Leu Ala Ala Leu His Glu
65 70 75 80

Asp Gly Pro Met Ser Pro Pro Phe Ala Ala Leu Arg Gln Val Ala Arg
85 90 95

Arg His Asp Phe Pro Asp Leu Trp Pro Met Asp Leu Ile Glu Gly Phe
100 105 110

Ala Met Asp Val Ala Asp Arg Glu Tyr Arg Ser Leu Asp Asp Val Leu
115 120 125

Glu Tyr Ser Tyr His Val Ala Gly Val Val Gly Val Met Met Ala Arg
130 135 140

Val Met Gly Val Gln Asp Asp Ala Val Leu Asp Arg Ala Cys Asp Leu
145 150 155 160

Gly Leu Ala Phe Gln Leu Thr Asn Ile Ala Arg Asp Val Ile Asp Asp
165 170 175

Ala Ala Ile Gly Arg Cys Tyr Leu Pro Ala Asp Trp Leu Ala Glu Ala
180 185 190

Gly Ala Thr Val Glu Gly Pro Val Pro Ser Asp Ala Leu Tyr Ser Val
195 200 205

Ile Ile Arg Leu Leu Asp Ala Ala Glu Pro Tyr Tyr Ala Ser Ala Arg
210 215 220

Gln Gly Leu Pro His Leu Pro Pro Arg Cys Ala Trp Ser Ile Ala Ala
225 230 235 240

Ala Leu Arg Ile Tyr Arg Ala Ile Gly Thr Arg Ile Arg Gln Gly Gly
245 250 255

Pro Glu Ala Tyr Arg Gln Arg Ile Ser Thr Ser Lys Ala Ala Lys Ile
260 265 270

Gly Leu Leu Ala Arg Gly Gly Leu Asp Ala Ala Ala Ser Arg Leu Arg
275 280 285

Gly Gly Glu Ile Ser Arg Asp Gly Leu Trp Thr Arg Pro Arg Ala
290 295 300

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 908 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:



ATGACCGATC TGACGGCGAC TTCCGAAGCG GCCATCGCGC AGGGTTCGCA AAGCTTCGCG 60

CAGGCGGCCA AGCTGATGCC GCCCGGCATC CGCGAGGATA CGGTCATGCT CTATGCCTGG 120

TGCAGGCATG CGGATGACGT GATCGACGGG CAGGTGATGG GTTCTGCCCC CGAGGCGGGC 180

GGCGACCCAC AGGCGCGGCT GGGGGCGCTG CGCGCCGACA CGCTGGCCGC GCTGCACGAG 240

GACGGCCCGA TGTCGCCGCC CTTCGCGGCG CTGCGCCAGG TCGCCCGGCG GCATGATTTC 300

CCGGACCTTT GGCCGATGGA CCTGATCGAG GGTTTCGGGA TGGATGTCGC GGATCGCGAA 360

TACCGCAGCC TGGATGACGT GCTGGAATAT TCCTACCACG TCGCGGGGGT CGTGGGCGTG 420

ATGATGGCGC GGGTGATGGG CGTGCAGGAC GATGCGGTGC TGGATCGCGC CTGCGATCTG 480

GGCCTTGCGT TCCAGCTGAC GAACATCGCT CGCGACGTGA TCGACGATGC CGCCATCGGG 540

CGCTGCTATC TGCCTGCCGA CTGGCTGGCC GAGGCGGGGG CGACGGTTGA GGGTCCGGTG 600

CCTTCGGACG CGCTCTATTC CGTCATCATC CGCCTGCTTG ACGCGGCCGA GCCCTATTAT 660

GCCTCGGCGC GGCAGGGGCT TCCGCATCTG CCGCCGCGCT GCGCGTGGTC GATCGCCGCC 720

GCGCTGCGTA TCTATCGCGC AATCGGGACG CGCATCCGGC AGGGTGGCCC CGAGGCCTAT 780

CGCCAGCGGA TCAGCACGTC GAAGGCTGCC AAGATCGGGC TTCTGGCGCG CGGAGGCTTG 840

GACGCGGCCG CATCGCGCCT GCGCGGCGGC GAAATCAGCC GCGACGGCCT GTGGACCCGA 900

CCGCGCGC 908

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 494 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Ser Ser Ala Ile Val Ile Gly Ala Gly Phe Gly Gly Leu Ala Leu
1 5 10 15

Ala Ile Arg Leu Gln Ser Ala Gly Ile Ala Thr Thr Ile Val Glu Ala
20 25 30

Arg Asp Lys Pro Gly Gly Arg Ala Tyr Val Trp Asn Asp Gln Gly His
35 40 45

Val Phe Asp Ala Gly Pro Thr Val Val Thr Asp Pro Asp Ser Leu Arg
50 55 60

Glu Leu Trp Ala Leu Ser Gly Gln Pro Met Glu Arg Asp Val Thr Leu
65 70 75 80

Leu Pro Val Ser Pro Phe Tyr Arg Leu Thr Trp Ala Asp Gly Arg Ser
85 90 95

Phe Glu Tyr Val Asn Asp Asp Asp Glu Leu Ile Arg Gln Val Ala Ser
100 105 110

Phe Asn Pro Ala Asp Val Asp Gly Tyr Arg Arg Phe His Asp Tyr Ala
115 120 125

Glu Glu Val Tyr Arg Glu Gly Tyr Leu Lys Leu Gly Thr Thr Pro Phe
130 135 140

Leu Lys Leu Gly Gln Met Leu Asn Ala Ala Pro Ala Leu Met Arg Leu
145 150 155 160

Gln Ala Tyr Arg Ser Val His Ser Met Val Ala Arg Phe Ile Gln Asp
165 170 175

Pro His Leu Arg Gln Ala Phe Ser Phe His Thr Leu Leu Val Gly Gly
180 185 190

Asn Pro Phe Ser Thr Ser Ser Ile Tyr Ala Leu Ile His Ala Leu Glu
195 200 205

Arg Arg Gly Gly Val Trp Phe Ala Lys Gly Gly Thr Asn Gln Leu Val
210 215 220

Ala Gly Met Val Ala Leu Phe Glu Arg Leu Gly Gly Thr Leu Leu Leu
225 230 235 240

Asn Ala Arg Val Thr Arg Ile Asp Thr Glu Gly Asp Arg Ala Thr Gly
245 250 255

Val Thr Leu Leu Asp Gly Arg Gln Leu Arg Ala Asp Thr Val Ala Ser
260 265 270

Asn Gly Asp Val Met His Ser Tyr Arg Asp Leu Leu Gly His Thr Arg
275 280 285

Arg Gly Arg Thr Lys Ala Ala Ile Leu Asn Arg Gln Arg Trp Ser Met
290 295 300

Ser Leu Phe Val Leu His Phe Gly Leu Ser Lys Arg Pro Glu Asn Leu
305 310 315 320

Ala His His Ser Val Ile Phe Gly Pro Arg Tyr Lys Gly Leu Val Asn
325 330 335

Glu Ile Phe Asn Gly Pro Arg Leu Pro Asp Asp Phe Ser Met Tyr Leu
340 345 350

His Ser Pro Cys Val Thr Asp Pro Ser Leu Ala Pro Glu Gly Met Ser
355 360 365

Thr His Tyr Val Leu Ala Pro Val Pro His Leu Gly Arg Ala Asp Val
370 375 380

Asp Trp Glu Ala Glu Ala Pro Gly Tyr Ala Glu Arg Ile Phe Glu Glu
385 390 395 400

Leu Glu Arg Arg Ala Ile Pro Asp Leu Arg Lys His Leu Thr Val Ser
405 410 415

Arg Ile Phe Ser Pro Ala Asp Phe Ser Thr Glu Leu Ser Ala His His
420 425 430

Gly Ser Ala Phe Ser Val Glu Pro Ile Leu Thr Gln Ser Ala Trp Phe
435 440 445

Arg Pro His Asn Arg Asp Arg Ala Ile Pro Asn Phe Tyr Ile Val Gly
450 455 460

Ala Gly Thr His Pro Gly Ala Gly Ile Pro Gly Val Val Gly Ser Ala
465 470 475 480

Lys Ala Thr Ala Gln Val Met Leu Ser Asp Leu Ala Val Ala
485 490



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1482 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

ATGAGTTCCG CCATCGTCAT CGGCGCAGGT TTCGGCGGGC TTGCGCTTGC CATCCGCCTG 60 

CAATCGGCCG GCATCGCGAC CACCATCGTC GAGGCCCGCG ACAAGCCCGG CGGCCGCGCC 120 

TATGTCTGGA ACGATCAGGG CCACGTCTTC GATGCAGGCC CGACGGTCGT GACCGACCCC 180 

GACAGCCTGC GAGAGCTGTG GGCCCTCAGC GGCCAACCGA TGGAGCGTGA CGTGACGCTG 240 

CTGCCGGTCT CGCCCTTCTA CCGGCTGACA TGGGCGGACG GCCGCAGCTT CGAATACGTG 300 

AACGACGACG ACGAGCTGAT CCGCCAGGTC GCCTCCTTCA ATCCCGCCGA TGTCGATGGC 360 

TATCGCCGCT TCCACGATTA CGCCGAGGAG GTCTATCGCG AGGGGTATCT GAAGCTGGGG 420 

ACCACGCCCT TCCTGAAGCT GGGCCAGATG CTGAACGCCG CGCCGGCGCT GATGCGCCTG 480 

CAGGCATACC GCTCGGTCCA CAGCATGGTG GCGCGCTTCA TCCAGGACCC GCATCTGCGG 540 

CAGGCCTTCT CGTTCCACAC GCTGCTGGTC GGCGGGAACC CGTTTTCGAC CAGCTCGATC 600

TATGCGCTGA TCCATGCGCT GGAACGGCGC GGCGGCGTCT GGTTCGCCAA GGGCGGCACC 660

AACCAGCTGG TCGCGGGCAT GGTCGCCCTG TTCGAGCGTC TTGGCGGCAC GCTGCTGCTG 720

AATGCCCGCG TCACGCGGAT CGACACCGAG GGCGATCGCG CCACGGGCGT CACGCTGCTG 780

GACGGGCGGC AGTTGCGCGC GGATACGGTG GCCAGCAACG GCGACGTGAT GCACAGCTAT 840

CGCGACCTGC TGGGCCATAC CCGCCGCGGG CGCACCAAGG CCGCGATCCT GAACCGGCAG 900

CGCTGGTCGA TGTCGCTGTT CGTGCTGCAT TTCGGCCTGT CCAAGCGCCC CGAGAACCTG 960

GCCCACCACA GCGTCATCTT CGGCCCGCGC TACAAGGGGC TGGTGAACGA GATCTTCAAC 1020

GGGCCACGCC TGCCGGACGA TTTCTCGATG TATCTGCATT CGCCCTGCGT GACCGATCCC 1080

AGCCTGGCCC CCGAGGGGAT GTCCACGCAT TACGTCCTTG CGCCCGTTCC GCATCTGGGC 1140

CGCiscinrA TG*. TCUK rrGGGK /5GCCGKG15CC CCGGGCTKTG* CCVSKGCGCS^T CTTinSXiSlSlW

CTGGAGCGCC GCGCCATCCC CGACCTGCGC AAGCACCTGA CCGTCAGCCG CATCTTCAGC 1260

CCCGCCGATT TCAGCACCGA ACTGTCGGCC CATCACGGCA GCGCCTTCTC GGTCGAGCCG 1320

ATCCTGACGC AATCCGCCTG GTTCCGCCCG CATAACCGCG ACCGCGCGAT CCCGAACTTC 1380

TACATCGTGG GGGCGGGCAC GCATCCGGGT GCGGGCATCC CGGGTGTCGT TGGCAGCGCC 1440

AAGGCCACGG CGCAGGTCAT GCTGTCGGAC CTGGCCGTCG CA 1482



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 382 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Ser His Asp Leu Leu Ile Ala Gly Ala Gly Leu Ser Gly Ala Leu
1 5 10 15

Ile Ala Leu Ala Val Arg Asp Arg Arg Pro Asp Ala Arg Ile Val Met
20 25 30

Leu Asp Ala Arg Ser Gly Pro Ser Asp Gln His Thr Trp Ser Cys His
35 40 45

Asp Thr Asp Leu Ser Pro Glu Trp Leu Ala Arg Leu Ser Pro Ile Arg
50 55 60

Arg Gly Glu Trp Thr Asp Gln Glu Val Ala Phe Pro Asp His Ser Arg
65 70 75 80

Arg Leu Thr Thr Gly Tyr Gly Ser Ile Glu Ala Gly Ala Leu Ile Gly
85 90 95

Leu Leu Gln Gly Val Asp Leu Arg Trp Asn Thr His Val Ala Thr Leu
100 105 110

Asp Asp Thr Gly Ala Thr Leu Thr Asp Gly Ser Arg Ile Glu Ala Ala
115 120 125

Cys Val Ile Asp Ala Arg Gly Ala Val Glu Thr Pro His Leu Thr Val
130 135 140

Gly Phe Gln Lys Phe Val Gly Val Glu Ile Glu Thr Asp Ala Pro His
145 150 155 160

Gly Val Glu Arg Pro Met Ile Met Asp Ala Thr Val Pro Gln Met Asp
165 170 175

Gly Tyr Arg Phe Ile Tyr Leu Leu Pro Phe Ser Pro Thr Arg Ile Leu
180 185 190

Ile Glu Asp Thr Arg Tyr Ser Asp Gly Gly Asp Leu Asp Asp Gly Ala
195 200 205

Leu Ala Gln Ala Ser Leu Asp Tyr Ala Ala Arg Arg Gly Trp Thr Gly
210 215 220

Gln Glu Met Arg Arg Glu Arg Gly Ile Leu Pro Ile Ala Leu Ala His
225 230 235 240

Asp Ala Ile Gly Phe Trp Arg Asp His Ala Gln Gly Ala Val Pro Val
245 250 255

Gly Leu Gly Ala Gly Leu Phe His Pro Val Thr Gly Tyr Ser Leu Pro
260 265 270

Tyr Ala Ala Gln Val Ala Asp Ala Ile Ala Ala Arg Asp Leu Thr Thr
275 280 285

Ala Ser Ala Arg Arg Ala Val Arg Gly Trp Ala Ile Asp Arg Ala Asp
290 295 300

Arg Asp Arg Phe Leu Arg Leu Leu Asn Arg Met Leu Phe Arg Gly Cys
305 310 315 320

Pro Pro Asp Arg Arg Tyr Arg Leu Leu Gln Arg Phe Tyr Arg Leu Pro
325 330 335

Gln Pro Leu Ile Glu Arg Phe Tyr Ala Gly Arg Leu Thr Leu Ala Asp
340 345 350

Arg Leu Arg Ile Val Thr Gly Arg Pro Pro Ile Pro Leu Ser Gln Ala
355 360 365

Val Arg Cys Leu Pro Glu Arg Pro Leu Leu Gln Glu Arg Ala
370 375 380

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1149 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

ATGAGCCATG ATCTGCTGAT CGCGGGCGCG GGGCTGTCCG GTGCGCTGAT CGCGCTTGCC 60 

GTTCGCGACC GCAGACCGGA TGCGCGCATC GTGATGCTCG ACGCGCGGTC CGGCCCCTCG 120 

GACCAGCACA CCTGGTCCTG CCACGACACG GATCTTTCGC CCGAATGGCT GGCGCGCCTG 180 

TCGCCCATTC GTCGCGGCGA ATGGACGGAT CAGGAGGTCG CGTTTCCCGA CCATTCGCGC 240 

CGCCTGACGA CAGGCTATGG CTCGATCGAG GCGGGCGCGC TGATCGGGCT GCTGCAGGGT 300 

GTCGATCTGC GGTGGAATAC GCATGTCGCG ACGCTGGACG ATACCGGCGC GACGCTGACG 360 

GACGGCTCGC GGATCGAGGC TGCCTGCGTG ATCGACGCCC GTGGTGCCGT CGAGACCCCG 420 



CACCTGACCG TGGGTTTCCA GAAATTCGTG GGCGTCGAGA TCGAGACCGA CGCCCCCCAT 480

GGCGTCGAGC GCCCGATGAT CATGGACGCG ACCGTTCCGC AGATGGACGG GTACCGCTTC 540 

ATCTATCTGC TGCCCTTCAG TCCCACCCGC ATCCTGATCG AGGATACGCG CTACAGCGAC 600 

GGCGGCGATC TGGACGATGG CGCGCTGGCG CAGGCGTCGC TGGACTATGC CGCCAGGCGG 660

GGCTGGACCG GGCAGGAGAT GCGGCGCGAA AGGGGCATCC TGCCCATCGC GCTGGCCCAT 720

GACGCCATAG GCTTCTGGCG CGACCACGCG CAGGGGGCGG TGCCGGTTGG GCTGGGGGCA 780

GGGCTGTTCC ACCCCGTCAC CGGATATTCG CTGCCCTATG CCGCGCAGGT CGCGGATGCC 840 

ATCGCGGCGC GCGACCTGAC GACCGCGTCC GCCCGTCGCG CGGTGCGCGG CTGGGCCATC 900 

GATCGCGCGG ATCGCGACCG CTTCCTGCGG CTGCTGAACC GGATGCTGTT CCGCGGCTGC 960 

CCGCCCGACC GTCGCTATCG CCTGCTGCAG CGGTTCTACC GCCTGCCGCA GCCGCTGATC 1020 

GAGCGCTTCT ATGCCGGGCG CCTGACATTG GCCGACCGGC TTCGCATCGT CACCGGACGC 1080

CCGCCCATTC CGCTGTCGCA GGCCGTGCGC TGCCTGCCCG AACGCCCCCT GCTGCAGGAG 1140

AGAGCATGA 1149

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 169 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Ser Thr Trp Ala Ala Ile Leu Thr Val Ile Leu Thr Val Ala Ala
1 5 10 15

Met Glu Leu Thr Ala Tyr Ser Val His Arg Trp Ile Met His Gly Pro
20 25 30

Leu Gly Trp Gly Trp His Lys Ser His His Asp Glu Asp His Asp His
35 40 45

Ala Leu Glu Lys Asn Asp Leu Tyr Gly Val Ile Phe Ala Val Ile Ser
50 55 60

Ile Val Leu Phe Ala Ile Gly Ala Met Gly Ser Asp Leu Ala Trp Trp
65 70 75 80

Leu Ala Val Gly Val Thr Cys Tyr Gly Leu Ile Tyr Tyr Phe Leu His
85 90 95

Asp Gly Leu Val His Gly Arg Trp Pro Phe Arg Tyr Val Pro Lys Arg
100 105 110

Gly Tyr Leu Arg Arg Val Tyr Gln Ala His Arg Met His His Ala Val
115 120 125

His Gly Arg Glu Asn Cys Val Ser Phe Gly Phe Ile Trp Ala Pro Ser
130 135 140

Val Asp Ser Leu Lys Ala Glu Leu Lys Arg Ser Gly Ala Leu Leu Lys
145 150 155 160

Asp Arg Glu Gly Ala Asp Arg Asn Thr
165

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 506 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

ATGAGCACTT GGGCCGCAAT CCTGACCGTC ATCCTGACCG TCGCCGCGAT GGAGCTGACG 60

GCCTACTCCG TCCATCGGTG GATCATGCAT GGCCCCCTGG GCTGGGGCTG GCATAAATCG 120 

CACCACGACG AGGATCACGA CCACGCGCTC GAGAAGAACG ACCTCTATGG CGTCATCTTC 180 

GCGGTAATCT CGATCGTGCT GTTCGCGATC GGCGCGATGG GGTCGGATCT GGCCTGGTGG 240 

CTGGCGGTGG GGGTCACCTG CTACGGGCTG ATCTACTATT TCCTGCATGA CGGCTTGGTG 300

CATGGGCGCT GGCCGTTCCG CTATGTCCCC AAGCGCGGCT ATCTTCGTCG CGTCTACCAG 360

GCACACAGGA TGCATCACGC GGTCCATGGC CGCGAGAACT GCGTCAGCTT CGGTTTCATC 420 

TGGGCGCCCT CGGTCGACAG CCTCAAGGCA GAGCTGAAAC GCTCGGGCGC GCTGCTGAAG 480

GACCGCGAAG GGGCGGATCG CAATAC 506 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 726 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

ATGTCCGGTC GTAAACCGGG TACCACCGGT GACACCATCG TTAACCTGGG TCTGACCGCT 60

GCTATCCTGC TGTGCTGGCT GGTTCTGCAC GCTTTCACCC TGTGGCTGCT GGACGCTGCT 120

GCTCACCCGC TGCTGGCTGT TCTGTGCCTG GCTGGTCTGA CCTGGCTGTC CGTTGGTCTG 180

TTCATCATCG CTCACGACGC TATGCACGGT TCCGTTGTTC CGGGTCGTCC GCGGGCTAAC 240

GCTGCTATCG GTCAGCTGGC TCTGTGGCTG TACGCTGGTT TCTCCTGGCC GAAACTGATC 300

GCTAAACACA TGACCCACCA CCGTCACGCT GGTACCGACA ACGACCCGGA CTTCGGTCAC 360

GGTGGTCCGG TTCGTTGGTA CGGTTCCTTC GTTTCCACCT ACTTCGGTTG GCGTGAAGGT 420

CTGCTGCTGC CGGTTATCGT TACCACCTAC GCTCTGATCC TGGGTGACCG TTGGATGTAC 480

GTTATCTTCT GGCCGGTTCC GGCTGTTCTG GCTTCCATCC AGATCTTCGT TTTCGGTACC 540

TGGCTGCCGC ACCGTCCGGG TCACGACGAC TTCCCGGACC GTCACAACGC TCGTTCCACC 600

GGTATCGGTG ACCCGCTGTC CCTGCTGACC TGCTTCCACT TCGGTGGTTA CCACCACGAA 660

CACCACCTGC ACCCGCACGT TCCGTGGTGG CGTCTGCCGC GTACCCGTAA AACCGGTGGT 720

CGTGCT 726



Claims 

1. A process for the preparation of canthaxanthin by culturing under suitable culture conditions a cell which is
transformed by a DNA sequence comprising the following DNA sequences:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA
sequence which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA
sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA
sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA
sequence which is substantially homologous;

e) a DNA sequence which encodes the β-carotene β4-oxygenase of the microorganism E-396 (FERM BP-4283)
[crtWE396] or a DNA sequence which is substantially homologous;

or a cell which is transformed by a vector comprising DNA sequences specified above under a) to e) and by
isolating canthaxanthin from such cells or the culture medium by methods known in the art.

2. A process for the preparation of a mixture of adonixanthin and astaxanthin or adonixanthin or astaxanthin alone by
a process as claimed in claim 1 characterized therein that in addition to the DNA sequences specified in claim 1
under a) to e) the following additional DNA sequence is present:

f) a DNA sequence which encodes the β-carotene hydroxylase of the microorganism E-396 (FERM BP-4283)
[crtZE396] or a DNA sequence which is substantially homologous;

and the DNA sequence specified under e) of claim 1 is as specified in claim 1 or the following sequence:

g) a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA
sequence which is substantially homologous;

and isolating the desired mixture of adonixanthin and astaxanthin or adonixanthin or astaxanthin alone from
such cells or the culture medium and separating the desired mixture or carotenoids alone from other
carotenoids which might be present by methods known in the art.

3. A process for the preparation of zeaxanthin by a process as claimed in claim 1 characterized therein that the DNA
sequence as specified under e) is replaced by the DNA sequence as specified under f) in claim 2 and by isolating
zeaxanthin from the cell or the culture medium and separating it from other carotenoids which might be present by
methods known in the art.

4. A process for the production of adonixanthin by culturing under suitable culture conditions a cell which is
transformed by a DNA sequence comprising the following heterologous DNA sequences:

a) a DNA sequence which encodes the GGPP synthase of the microorganism E-396 (FERM BP-4283)
[crtEE396] or a DNA sequence which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of the microorganism E-396 (FERM BP-4283)
[crtBE396] or a DNA sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of the microorganism E-396 (FERM BP-4283)
[crtIE396] or a DNA sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of the microorganism E-396 (FERM BP-4283)
[crtYE396] or a DNA sequence which is substantially homologous;

e) a DNA sequence which encodes the β-carotene hydroxylase of the microorganism E-396 (FERM BP-4283)
[crtZE396] or a DNA sequence which is substantially homologous; and

f) a DNA sequence which encodes the β-carotene β4-oxygenase of the microorganism E-396 (FERM BP-4283)
[crtWE396] or a DNA sequence which is substantially homologous;

and isolating adonixanthin from the cell or the culture medium and separating it from other carotenoids which
might be present by methods known in the art.

5. A process for the preparation of a food or feed composition characterized therein that after a process as claimed in
any one of claims 1 to 4 has been effected the carotenoid or carotenoid mixture is added to food or feed.

6. A process as claimed in any one of claims 1 to 5 characterized therein that the transformed host cell is a prokaryotic
host cell, like E. coli, Bacillus or Flavobacter.

7. A process as claimed in any one of claims 1 to 5 characterized therein that the transformed host cell is a
eukaryotic host cell, like yeast or a fungal cell.
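
The gene combinations recited in claims 1 to 4 can be summarized as a small lookup table. The Python sketch below is only a reading aid: the dictionary layout and the names `BASE` and `CLAIMS` are assumptions, and the source organisms of the genes (Flavobacterium sp. R1534, E-396, Alcaligenes PC-1) are omitted for brevity.

```python
# Genes common to all four processes: GGPP synthase, prephytoene synthase,
# phytoene desaturase and lycopene cyclase.
BASE = frozenset({"crtE", "crtB", "crtI", "crtY"})

# Claim number -> (gene set, carotenoid product), source organisms omitted.
CLAIMS = {
    1: (BASE | {"crtW"}, "canthaxanthin"),
    2: (BASE | {"crtW", "crtZ"}, "adonixanthin and/or astaxanthin"),
    3: (BASE | {"crtZ"}, "zeaxanthin"),
    4: (BASE | {"crtZ", "crtW"}, "adonixanthin"),
}

for number, (genes, product) in sorted(CLAIMS.items()):
    print(f"claim {number}: {', '.join(sorted(genes))} -> {product}")
```

Seen this way, claim 3 differs from claim 1 only in replacing crtW by crtZ, while claims 2 and 4 use the same six gene functions and differ in the source of the genes and the product isolated.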

Fig. 1 [pathway scheme; legible labels: crtE, crtB, crtY, CH2OPP]

Fig. 2

Fig. 3

Fig. 4

1 MTPKQQFPLR DLVEIRLAQI SGQFGVVSAP LGAAMSDAAL SPGKRFRAVL
51 MLMVAESSGG VCDAMVDAAC AVEMVHAASL IFDDMPCMDD ARTRRGQPAT
101 HVAHGEGRAV LAGIALITEA MRILGEARGA TPDQRARLVA SMSRAMGPVG
151 LCAGQDLDLH APKDAAGIER EQDLKTGVLF VAGLEMLSII KGLDKAETEQ
201 LMAFGRQLGR VFQSYDDLLD VIGDKASTGK DTARDTAAPG PKGGLMAVGQ
251 MGDVAQHYRA SRAQLDELMR TRLFRGGQIA DLLARVLPHD IRRSA

Fig. 9

1 MTDLTATSEA AIAQGSQSFA QAAKLMPPGI REDTVMLYAW CRHADDVIDG
51 QVMGSAPEAG GDPQARLGAL RADTLAALHE DGPMSPPFAA LRQVARRHDF
101 PDLWPMDLIE GFAMDVADRE YRSLDDVLEY SYHVAGVVGV MMARVMGVQD
151 DAVLDRACDL GLAFQLTNIA RDVIDDAAIG RCYLPADWLA EAGATVEGPV
201 PSDALYSVII RLLDAAEPYY ASARQGLPHL PPRCAWSIAA ALRIYRAIGT
251 RIRQGGPEAY RQRISTSKAA KIGLLARGGL DAAASRLRGG EISRDGLWTR
301 PRA

1 MSSAIVIGAG FGGLALAIRL QSAGIATTIV EARDKPGGRA YVWNDQGHVF
51 DAGPTVVTDP DSLRELWALS GQPMERDVTL LPVSPFYRLT WADGRSFEYV
101 NDDDELIRQV ASFNPADVDG YRRFHDYAEE VYREGYLKLG TTPFLKLGQM
151 LNAAPALMRL QAYRSVHSMV ARFIQDPHLR QAFSFHTLLV GGNPFSTSSI
201 YALIHALERR GGVWFAKGGT NQLVAGMVAL FERLGGTLLL NARVTRIDTE
251 GDRATGVTLL DGRQLRADTV ASNGDVMHSY RDLLGHTRRG RTKAAILNRQ
301 RWSMSLFVLH FGLSKRPENL AHHSVIFGPR YKGLVNEIFN GPRLPDDFSM
351 YLHSPCVTDP SLAPEGMSTH YVLAPVPHLG RADVDWEAEA PGYAERIFEE
401 LERRAIPDLR KHLTVSRIFS PADFSTELSA HHGSAFSVEP ILTQSAWFRP
451 HNRDRAIPNF YIVGAGTHPG AGIPGVVGSA KATAQVMLSD LAVA

Fig. 11

1 MSHDLLIAGA GLSGALIALA VRDRRPDARI VMLDARSGPS DQHTWSCHDT
51 DLSPEWLARL SPIRRGEWTD QEVAFPDHSR RLTTGYGSIE AGALIGLLQG
101 VDLRWNTHVA TLDDTGATLT DGSRIEAACV IDARGAVETP HLTVGFQKFV
151 GVEIETDAPH GVERPMIMDA TVPQMDGYRF IYLLPFSPTR ILIEDTRYSD
201 GGDLDDGALA QASLDYAARR GWTGQEMRRE RGILPIALAH DAIGFWRDHA
251 QGAVPVGLGA GLFHPVTGYS LPYAAQVADA IAARDLTTAS ARRAVRGWAI
301 DRADRDRFLR LLNRMLFRGC PPDRRYRLLQ RFYRLPQPLI ERFYAGRLTL
351 ADRLRIVTGR PPIPLSQAVR CLPERPLLQE RA

Fig. 12

1 MSTWAAILTV ILTVAAMELT AYSVHRWIMH GPLGWGWHKS HHDEDHDHAL
51 EKNDLYGVIF AVISIVLFAI GAMGSDLAWW LAVGVTCYGL IYYFLHDGLV
101 HGRWPFRYVP KRGYLRRVYQ AHRMHHAVHG RENCVSFGFI WAPSVDSLKA
151 ELKRSGALLK DREGADRNT

Fig. 14

I — crt£ 

#100: 5 • rarar arr?rrr Jacaocacaaal cca cgrATG ACGCCCAAGCAGCAGCAArTC 3 ' 
Spcl RBS ^<^^ 



#101- 5 • TATATACiXiaGG-CAGCCGCGACGGCCTGTGG 3 ' 

SmU ...1 

j — ^ crtZ 

#104. 5 • i"«t-^r jrTAg>rrrjacagqagaaak t:ar:az2ZSAGCAC7TGGGCCGCAATCC 3 ' 
BcoRl . RBS ^^'^ 

: , I 

#105: 5* GTTTCAGCTCTGCCTTGAGGC. 3' 

■ • t ■ ' 

era— *T p-^ crtY . ' 

MUTl: 5 • GCGAAGGGGCGGATCGCAATAC c r:!:^aaaga ooa^pircTrqATGAGCCA>TGATCTGCTGA.TCG- 3 ' 

: PmU : . • 

MVP- 5* cgcccCTGCTGCAGGAGAGAGC t T daaaaqaaq r a r ff aaATGAGTTCCGCCATCGTCATCG 3* 

Muni - - : ; . ' i 

• t ^ . • - .. i ■ 

; crti — j-r*- crtB \ 

Kom: 5 • GGTCATGCTGTCGGACCTGGCCGTCGC tZ%aigoi^^^aa^cATGACCGAT * 

. BaxnHl ^ ; 



MUT5: 5 * ^TXTHTr-r ri^ r qccC ccctd caaGCTCTCTCCTGCAGCAGGG 3 ' 
Mnil X* — crtY 



MUT6:5' atgaccflnaicc^ccttr^aGCGACGGCCAGGTCCGACAGC 3' 
BaxnHl 1 m crtI i 



CARH 5 • CAGAACCCATCACCTGCCCGTC 3 ' 



5 • CGCGAiilCTCGCCGGCAATAGTTACC 3 • 
EcoRl 



ai<- 5 • CTr&riiTr:r&T ff ATPT TAC fS^nr'rr ATXAfir.ATGTK^rPTrTTCAACTAACGGGGCAGG ' 3 * 
Sphl SacI AaiD 
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Fig. 20/1
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Fig. 20/2
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Fig. 24/1



CTAAATTGTAACCGTTAArATTTTGTrAAAATtCGCGTTAAATTTTTGTTAAATCAGCTC 

1 + : — " + 60 

GATTTAACAT?CGCAArTATAAAACAAT7TT;cAGCGCX\TTTAAAAACAATTTAGTC« 

ArrTTrTAACCAATAGCCCGAAATCGGCAAAATCCCTiArAAATCAAAAGAATAGA 

61 ---------+---------+---------+---------+---------+---------+ 120

TAAAAAATTGGTTArcCGGCTrrAGCCGTTTTACGGAATATTTACTTTTCTTATCTGGCT 

GATAGGGTTGAG7GTTGTTCCAGTTTGGAACAAGACTCCACTATTAAAGAACGTGGACTC 

121 -:r-^'*- — — ^ + 180 

CTATCCCAACTCACAACAAGGTCAAACCTTG7TCTCAGGTGATAATTTCTTGCACCTGAG 

CAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACC 

181 ---------+---------+---------+---------+---------+---------+ 240

GTTGCACTTTCCCGCTTTTTGGCAGATACTCCCGCTACCCGCTGATGCACTTGGTAGTGG 

CTAATCAACrrrTTTTGGGGTCGAGGTGCCGTAAiGCACT;^ 

241 -> + — r +- — ^ : h 300 

.t GATTAGTTCAAAAAACCCCAGCTCCACGGCATrrCGTGATTTAGCCTTGGGATTTCCCTC 

CCCCCGATCTAGAGCTTGACGGGGAAA GCCGGCGAACGTCGCGAGAAAGGAACGGAAGAA 

301 '—-i- ^-^T-..-- . -r -r-' + 1— -r 360 

GGGGGCrA^TCTCGAACTGCCCCTTTCGGCCGCTTGCACCGCTCTTTCCTTCCCTTC^ 

AGCGAAAGGAGCGGGCGCtACkKSCGCTGGCJVAGTGTAGCGGTCACGCTGCGCCTAACCAC 

361 ^-S---^ r T-t^— ; r+ 420 

TCGCTTTCCTCGCCCCCGATCCCCCGACCGTTCACATCGCCAGTCCGACGCGCAT^ 

CACACCCGCttCGCTTAATCCCCCGCTACACCGCGCCTCCCATTCGCCArrCAGG 

421 ---------+---------+---------+---------+---------+---------+ 480

GTCTGKXJCGGCGCGAATTACGCGGCGATGTCCCGCGCAGGGTAAGCGGTAAGT?^ . ft 

CJU^CTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGG * 

481 ---------+---------+---------+---------+---------+---------+ 540

GTTGACAACCCTTCCCGCTACCCACCCCCGGAGAAGCGATAATGCGGTCGACCGC 

GGGATCrTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTG 

541 — : — — ~+ 600 

V CCCTACACGACCTTCCGCTAArre?J^CCCATfGCGGTCCCAAAAGGGTCAGTGCT 

TAAAACGAeGGCCAGJCAGCGCGbGTAArACGACTCACTATAGGGCG;^ 
601 ^^^^J: L , — + 660 

ATTTTGCTGCCGGTCACTCGCGCGCATTATGCTGAGTGATArCCCGCTTAACCTCGAGGT 

CCGCGGTGGCGGCCGCTCTAGTGGATCCGCGCCTGGCCGTTCGCGATCAGCAGCCGCCCT 

661 ---------+---------+---------+---------+---------+---------+ 720

GGCGCCACCGCCGGCGAGATCACCTAGCCGCGGACCGGCAAGCGCTAGTCGTCGGCGGGA 

TGCGGATCGGTCAGCATCATCCCCATGAACCGCAGCGCACGACGCAGCGCGCGCCCCAGA 

721 ---------+---------+---------+---------+---------+---------+ 780

ACGCCTAGCCACTCGTAGTAGGGGTACTTGGCGTCGCGTGCTGCGTCGCGCGCCGGGTCT 

rCGGCCCCGTCCAGCACGGCA?GCGCCATCATCGCGAAGGCCCCCGGC(GGCATGGGGCGC 

781 ---------+---------+---------+---------+---------+---------+ 840

AGCCCGCCCAGGTCGTGCCGTACGCGGtAGTACCCCTTCCGGGGGCCGCCGTACCCCGCG 

GTGCCCATTCCGAAGAACTCGCAGCCTGTCCGC7CCCCAACGTCGCGCCAGATCGCGCCG 

841 4. * ^-r — s- 900 

CACCGCTAACGCTTCTTGAGCGTCGGACAGGCGACGCCTTCCAGCGCJ3GTCTAGCGCGGC 

TATTCCCATGCAGTCACGGCCCCCATGCGCGTCGGCCCGCCCTGCCCCCCCCCCACCAGC * 
901 ' r- — : 4. 960 

ATAAGGC7ACGTCAC7GCCCGGGCTACGCCCACCCCCCCGGGACGGCGCGCCCGTGGTCG 
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Fig. 24/2



GCATCGCGCACGAACCCTTCCCACATGATCTCCTGATCCATGCCCCGTCATTGCAAAACC 

961 ---------+---------+---------+---------+---------+---------+ 1020

CCTAGCCCCTGCTTGGGAAGGCTCTACTACACGACTAGGXACCGGGCAGTAACGTTTTGG 

GATCACCGATCCTGTCGCGTGATGGCATTGTTTGCAArGCCCCGAGGGCTAGGATGGCCC 

1021 i- + + -i- ► 1080 

CTAGTCCCTAGGACAGCGCACTACCC7AACAAACGTTACGGGGCTCCCGATCCTACCGCC 

GAAGGATCAAGGGGGGGAGAGACATGGAAATCGAGGGACGGGTCTTTGTCGTCACGGGCG 

CTTCCTAGTTCCCCCCCTCTCTCTACCTTTAGCTCCCTGCCCAGAAACAGCAGTGCCCGC 

CCGCA7CGGGTCTGGGGGCGGCCTCGGCGCGGATGCTGGCCCAAGGCGGCGCGAAGGTCG ~ - 

1141 ---------+---------+---------+---------+---------+---------+ 1200

GGCGXAGCCCAGACCCCCGCCGGAGCCGCGCCTACGACCGGGTTCCCCCCCCCTTCCAGC : 

TGCTGGCCGATCTGGCGGAACCGAAGGACGCGCCCGAAGGCt^GTTCACGCGGCCTGCG 
1201 + i + —J i- 1260 

acgacccgctagaccgccttgccttcctgcgcgggcttcxcccccXactgcxccgga 

ACGTGACCGACGCGACCGGTCCGCAGACGGCCATCCCGCTGGCGACCGACCGGTTCGGCA 

1261 ■ * ^ K 1 ' ► 1320 

TGCACTGGCTGCGCTGCCGACGCGTCTGCCGGTAGCGCGACCGCTGGCTGGCGAAGCGGT - • 

ggctcgacggccttgtgaactgcgcgggcatccccccggccgaacggatgctgggccgcg 

1321 —^——-4. — ^ ► 1. !• 1380 

ccgacctgccc»saacacttgacgcgcccgtagcgcggccggcttgcctacga 
aqgggccgcatggactcksacagcittgcccgtgcggtcacgatcj^ 

1381 -J^, ^ ^ — i- — , ^ 1440 

tgcccgccgtXcctgacctgtcgaaacgcgcacgcc^ • 
tcaacatggcccgccttgcagccgaggcgatggcccgg^ 

1441 ± -4=--- ' -^-^ 1. '-^- ^ - - : i- 1500 

ACTTCfACCGiSGCGGAACGTCTCCTCCG^ • - • 

GTCGCOTtyVTCGTCAACAO^ 

1501 -4.-;— ^ 1 1. .-^4. u — 1 — ► 1560 

CACXGCACtAGCAGTTGTGCCGGAGCrrAGCGCCGCGTCCTGCCT ' ^ 

CCTATCCGGCCA'aiXXGGCCGiGCCTGGCCiSGCATGA • 

1561 ^ 1. ► : . = 1 H 1620 

GGATACCkrCCOTCGTTCCGCCCGCXCCCCCCGTACTCk: 

^ CGCGGCACGGCAtCCGCGTCATGACCATGGCGCCCGGCATC^ • - 

1621 ---------+---------+---------+---------+---------+---------+ 1680

CCGCCGTG'cCGTAGGCGCAGTAeXGGTAGCGCGGGCCGTAC^ •- • : . 

AGGGGCTCCCGCAGGACGTtCA'GGACAGCCTCGGCGCG(^GGTGCGCT 

1681 ' — : — ~ ^- --r- ^— ^ i- 1740 

TCCCCGACGGCGTCCTGCAACfCCTGTCGGACCCGCGCCGCCACGGGAAC^ - 

TGGGACAcbciKCGGAATACGCGGCCCTGTt^ 
1741 — ~ ^Ti--;^-- ; — + 1800 

ACCCTCTCCCCAGCCTTAtCCGCCGCGA<^CG^ - 

„ ACGGAGAGGTCATCCGCCTCGACCtoGCATTGCGCATGGCCCCCAAGTGAiV 

1801 ---------+---------+---------+---------+---------+---------+ 1860

TCCCf CTCC?i<rr AGGCCXACCTCCCGCCTAACGCGTACCGGGGGTTC^ - 

CATGGACCCCATarrCATCACCGGCGCGATGCGCACCCCGATGGGCGCAT^^ 
1861 ---------+---------+---------+---------+---------+---------+ 1920

CTACCTGGGCTAGCAGTAGTGGCCGCGCTACCCGTGGGGCTACCCCCGfi^CCtCCCCCT 

TCTTCCCCCCAfGGATGCCCCCACCCTtGGCCCCCACGCCATCCCCGCCGTC ' * 

1921 ---------+---------+---------+---------+---------+---------+ 1980

ACAACGCCCCTACCTACCGCCCTGGCAACCGCGCCTGCGC-AGGCGCGCCCCCACTTGCC - 
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CCTGTCCCCCGACATGGTGGACGAGG7GCTCATGGGCTGCGTCCTCGCCGCGGGCCAGGG 

1981 -T ^ ^ -i- + ^ 2040 

GGACAGCCGGCTGTACCACCTGCTCCACGACTACCCGACGCACGAGCGGCGCCCCCTCCC 

TCAGGCACCGGCACGTCAGGCGGCGCTTGGCGCCGGACTGCCGCTGTCGACGGGCACGAC 

2041 -r + + ^ H 2100 

AGTCCG7GGCCGTCCAGTCCGCCGCGAACCGCGGCCTGACGGCGACAGCTGCCCGTGCTG 

CACCATCAACGAGATGTGCGGATCGGGCATGAAGGCCGCGATGCTGGGCCATGACCTGAT 

2101 — rr— ^ — -5- ■+ ^ + + 2160 

G7GGXAGTTGCTCTACACGCCTACCCCG7ACTTCCCGCGCTACGACCCGGTACTGGACTA . 

CGCCGCGCCATCGCCGCGCATCCTCGTCGCCGGCGGGATCGAGAGCATGTCGAACGCCCC 

2161 -r-rr- *- ^+-^— ^ , + , .-^^-+ 2220 

GCGGCGCCCTAGCCGCCCGTAGa^GCAGCGGCCGCCCTACCTCTCGTACAGCTTGCGGGG • 

CTACCTGCTGCeCAAGGCGCtKSTCGGGGATGCGCATGGGCCATGACCGTGTGCTGGATCA - 

2221 — r-. K -f-— + J. K 2280 

GATGGACGACGGGTTCCGCGCCAGCCCCTACGCGTACCCGGTACTGGCACACGACCTAGT 

CATGTTCCTCCACGGGTTGGAGGACGCCTATGACAAGGGCCGCCTGATGGGCACCTTCGC 

2281 -rr-r + — + K 1. — — . 2340 

GTACAAGGAGCTGCCCAACCTCCrGCGGATACTGTTCCCGGCGGACTACCCGTCK3AA^ , - 

CGAGGArrCGGCCGGCGATCACCGTTTCACCCGCGAGGCGOCGGACGACTATCqGCTGAC • 

2341 ~ + ^-4.-^^ : 1. + , h — ^. — ^. « '„ .2400 

GCTCCTAACGCCGCCCCTAGTGCCAAAGTGGGCGCTCCGCGTCCTGCTGATACGCGACTG : 

CAGCCTGCCCCGCGCGCAGGAGGCCATCa:.CACCGGTGCCTTqGCCG^ 

2401 i •■ ^ — —--4: 24 60 

GTCGGACCCGGCGCGCCrrCCTGCGGTAGCGGT^^ '-.^ 

CGTGACCGTCACOGCACXSCAAGGTGCAGACCACCGTCGATACCGACGAGATGCCCGGCAA 

2461 ---------+---------+---------+---------+---------+---------+ 2520

GCACTGGCACTCCCGTGCGTTCCACGTCTGGTGGCA y 

GGCCCCCCCCCAGAACATCCCCCATCTGAAGCCCGCCTTCCCTGACGGTGGCACGGTCAC - 

2521 ---------+---------+---------+---------+---------+---------+ 2580

CCGGGCCGGGCTCTTCTAGCKXKTrAGACTTCGCqCCGAAGGC^ 

GGCGGCCAACAGCTCGTCGATCTCGGACGGGGCGGCGGCGCTCGTGATGATGCGCGAG^ • 

2581 ---------+---------+---------+---------+---------+---------+ 2640

CCGCCGCTTGTCGAGCAGCTAGAGCGTGCGCCGCCGCCGCGACCACTACTACGC ,. ^- 

GCAGGCCGAGAAGCTGGGeCTCACGCCGATCGCGCGGATCATCGGTCATGCGACCC^ 

2641 ---------+---------+---------+---------+---------+---------+ 2700

CGTCCGGC.TCTTCGACCCGGACTGCGGCTAGCGCGCCTAGTA .• 

CGACCGTCCCGGCCTGTTGCCGACGGCCCCCATCGGGGGGATGCGCAAGCTGCTGGACCG - 

2701 ---------+---------+---------+---------+---------+---------+ 2760

GCTGGCAGGGCCPPAGAAGGGCTGCCGGGGCTAGCCCCGCTACGCGT^ . . 

CACGGACACCCGCCTTGGCGATTACGACCTGTTCGAGGTGAACGAGGCATTCGCCGTCGrr^ ... 

2761 ---------+---------+---------+---------+---------+---------+ 2820

GTGCCTGTCGGC.CGAACCGCTAATCC^GGAGAAGCTCCACTT 

CGCCATCAT.CCCGATGX^CGAGCTTGGCCTGC^^ 

GCGGTACTAGCGCTACTT'CCTCGiVACCGGACpCTCTC^ 

GGCCTGCCCGCTTGGGCATCCC ATCCCC GCGTCGGCCGCGCGGATC ATGGTC ACGCTGCT 

2881 ---------+---------+---------+---------+---------+---------+ 2940

CCGGACCCCCGAACCCGTAGGGTACCCCCGCAGCCCCCGCCCCTA^ 

GAACCCCATGGCGCCCCGCCCCGCGACGCCCGGGGCCGCATCCGTCTGCATCCGCCCGGG 

2941 ---------+---------+---------+---------+---------+---------+ 3000

CTTCCGCTACCGCCGCGCCCCGCGCTGCGCGCCCCGGCCTAGGCAGACCTAGCCGCCCCC 
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CGAGGCGACGGCCArCCCGCTGGAACGGCTGAGCTAATTCATTTGCGCGAATCCGCGTTT 
3001 ---------+---------+---------+---------+---------+---------+ 3060

GCTCCCCTGCCGw i AGCGCGACCTTGCCGACTCGATTAAGTAAACGCGCTTAGGCCCAAA 

TTCGTGCACCATGGGGGAACCCCAAACGGCCACGCCTGTTGTGGTTGCGTCGACCTCrC 
3061 ---------+---------+---------+---------+---------+---------+ 3120

AAGCACGTGCTACCCCCTTCGCCTTTGCCGGTCCGGACAACACCAACGCACCTGGACAGA 

AGCCCGGTACGGGCACTGCGC7ACACCGTCCGCGTACCCCGCAACGGCTAGGCCACCGTA ^ ' ^ ^ 

GACTGACgCAACGAAGGCACCGATGAC3CCCAAG<^GCAATTCCgegT;igGgf:;;»XaTfyri ' 
3181 ---------+---------+---------+---------+---------+---------+ 3240

CTGACTGCGTTGCTTCCSTGGCTACTGCGGG7TCGTCGTTAAGGGGGATGCGCTAGACCA • 

CCAGATCAGGCTGGCGCAGATCTCGGGCCAGTTCGGCGTGGTCTCGGCCCCCCTCGGCGC 
3241 ---------+---------+---------+---------+---------+---------+ 3300

GCTCTAGTCCGACCGCGTGTAGAGCCCGGTCAACCCGCACCAGAGCCGGGGCGAGCCGCG 

GGCCATGAGCGATGCCGCCCTGTCCCCCCGCAAACGCTTTCGCGCCCTGCTGATGCTGAT 
3301 ---------+---------+---------+---------+---------+---------+ 3360

CCGGTACTCCCTACGGCGGGACAGGGGGCCGTTTGCGAAAGCGCGGCACGACTACGACTA 

GGTCGCCGAAAGCTCGGGCGGGGTCTGCGATGCGATGGTCGATGCCGCCTGCCCGGTCGA 
3361 — ^ ^ ^ , 3420 

CCAGCCCCTTTCGAGCCCGCCCCAGACGCTACGCTACCAGCTACGGCGGACGCGCCAGCT 

GATCGTCCATCCCGCATCGCTGATCTTCGACGACATGCCCTGCATGGACGATCCCAGGAC 
3421 ---------+---------+---------+---------+---------+---------+ 3480

CTACCAGGTACGGCGTAGCCACTAGAAGCTGCTGTACGGGACGTACCTGerACCGTCCTG 

CCCTCGCCGTCAGCCCGCCACCCATGTCGCCCATGGCGAGGGGCGCGCGGTGCTTGCGGG 
3481 ---------+---------+---------+---------+---------+---------+ 3540

GGCAGCGCCAG7CGGGCGGTGGGTACAGCGGGTACCGCTCCCCGCGCGCCACGAACGCCC 

CATCGCCCTGATCACCGAGGCCATGCGGATTTTCGGCCAGGCGCCCGGCGCGACGCCGGA 
3541 ---------+---------+---------+---------+---------+---------+ 3600

GTAGCCGCACTACTGGCrcCGGTACGCCTAAAACCCGCTCCGCGCGCCGCGCTGCGGCCT 



TCAGCGCGCAACGCTGGTCCCATCCATCTCGCCCGCGATGGGACCGGTGGGGCTGTCCGC / ' . 

3601 ---------+---------+---------+---------+---------+---------+ 3660



AGTCGCGCGTTCCGACCAGCCTAGGTACAGCGCGCCCTACCCTGGCCACCCCGACACGCG ; 

3661 ---------+---------+---------+---------+---------+---------+ 3720

TCCCGTCCTAGACCTGGACGTGCGGCGGTTCCTGCGGCGGCCCTAGCTTGCACTTGTCCT 

CCTCAAGACCGGCGTGCTGTTCGTCGCGGGCCTCGAGATGCTGTCCATTATTAAGGGTCT 
3721 ---------+---------+---------+---------+---------+---------+ 3780

GGAGTTCTGGCCGCACGACAAGCAGCQCCCCGAGCTCTACGACAGGTAATAATTCCCAGA 

CGACAAGGCCGAGACCGAGCACCTCATGGCCTTCGCGCGTCAGCTTGGTCGGGTCTTCCA 
3781 ---------+---------+---------+---------+---------+---------+ 3840

CCTGTTCCCCCTCTGGCTCCTCGACTACCGGAAGCCCGCAGTCGAACCAGCCCAGAAGGT 

GTCCTATGACGACCTGCTGCACGTGATCGGCGACAAGGCCAGCACCGGCAAGGATACGGC . . 
3841 ---------+---------+---------+---------+---------+---------+ 3900

CAGGATACTGCTCGACGACCTGCACTACCCGCTGTTCCGGTCGTGGCCGTTCCTATGCCG 



GCGCGACACCGCCGCCCCCGCqcCAAAGGGCGGCCTGATGCCGGTCGGACAGATGGGCGA 

CGCCCTCTCGCCGCGGGGGCCGGGTTTCCCCCCGGACTACCCCCAGCCTGTCTACCCGCT 

CGTGGCCCAGCATTACCGCGCCAGCCGCGCGCAACTCGACGACCTGATGCGCACCCGCCT 
--------- + 

CCACCGCGTCCTAATCCCGCGGTCGCCCCGCGTTGACCTGCTCGACTACGCGTGCCCCCA 
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CTTCCCCGGCGCGCAGATCGCC-GACCTGCTGCCCCGCGTGCTGCCGCATGACATCCGCCG 

4021 + -r ■!- + 4080 

CAAGGCGCCCCCCGTCTACCCCCTGCACGACCGCGCGCACGACGCCGTACTGTAGGCGGC 

CAGCGCCTACCCGCGCGG7CGGGTCCACAGGCCGTCGCCCCTGATTTCCCCGCCGCGCAG 
4081 ---------+---------+---------+---------+---------+---------+ 4140

gtcgccgatccgcgccccagcccaggtgtccggcagcgccgactaaagcggcggcgcgtc 
gcgcgatgcggccgcgtcc:aagcctccgcgcgccagaagcccgatcttggcagccttcga 

4141 ---------+---------+---------+---------+---------+---------+ 4200

CGCGCTACCCCGCCGCAGG7TCGGAGGCGCGCGGTCTTCGGGCTAGAACCGTCGGAAGCT 

CGTCCTGATCCGCTGGCGATACGCCTCGGGGCCACCCTGCCGGATGCGCGTCCCGATtGC 

4201 ---------+---------+---------+---------+---------+---------+ 4260

GCACGACTAGCCGACCGCTATCCGGAGCCCCGGTGGGACGGCCTACGCGCAGGGCTAACG 

GCGATAGATACGCAGCGCGGCGGCGATCGACCACGCGCAGCGCGGCGGCAGATGCGGAAG 

4261 ---------+---------+---------+---------+---------+---------+ 4320

CGCTATCTATGCGTCGCGCCGCCGCT AGCTGGTGCGCCTCGCGCCGCCGTCTACGCCTTC ' 

CCCCTGCCGCGCCGAGGCATAATAGGGCTCGGCCGCGTCAAGCAGGCGGATGATGACGGA 

4321 K -r * K h 4380 

GGGGACGGCGCGGCTCCGTATTATCCCGAGCCGGCGCAGTTCGTCCGCCTACTACTGCCT " 

ATAGAGCGCGTCCGAAGGCACCGGACCCTCAACCGTCGCCCCCGCCTCGGCCAGCCAGTC - 

4381 ---------+---------+---------+---------+---------+---------+ 4440

TATCTCGCGCAGGCTTCCGTGGCCTGGGAGTTGGCAGCGGGGGCGGAGCCGGTCGGTCAG - 

GGCAGGCAGATAGCAGCGCCCGATGGCGGCATCGTCGATCACGTCGCGAGCGA7GTTCGT 

4441 ---------+---------+---------+---------+---------+---------+ 4500

CCGTCCGTCTATCGTCGCGGGCTACCGCCGTACCAGCTAGTGCAGCGCTCCCTACAACCA 

CAGCTGGAACGCAAGGCCCAGATCGCAGGCGCGATCCAGCACCGCATCGTCCTGCACGCC : 

4501 ---------+---------+---------+---------+---------+---------+ 4560

GTCGACCTTGCCTTCCGGGTCTAGCGTCCGCGCTAGGTCGTGGCGTACCAGGACGTCCGG 

CATCACCCGCGCCATC ATCACGCCCACGACCCCCGCGACGTGGTAGGAATATTCCAGCAC 

4561 ---------+---------+---------+---------+---------+---------+ 4620

GTACTGCGCCCGGTAGTAGTGCGGCTGCTGCCGGCGCTGCACCATCCTTAXAAGGTCCTG * 

GTCATCCAGGCTGCGGTATTCGCGATCCGCGACATCCATCGCGAAACCCTCGATCXGGTC 

4621 ---------+---------+---------+---------+---------+---------+ 4680

CACTACGTCCGACGCCATAAGCGCTACGCGCTCTAGGTAGCGCTTTGGGAGCTAGTCCAG 

CATCGGCCAAAGGTCCGGGAAATCATGCCGCCGGGCGACCTGGCGCAGCGCCGCGAAGGG 

4681 ---------+---------+---------+---------+---------+---------+ 4740

GTAGCCGGTTTCCAGGCCCTTTAGTACGGCGGCCCGCTGGACCGCGTCGCGGCGCTTCCC 

CGGCGACATCGGGCCGTCCTCGTGCAGCGCGGCGAGCGTCTCGGCGCGCAGCGCCCCCAG 

4741 ---------+---------+---------+---------+---------+---------+ 4800

GCCGCTGTAGCCCGGGAGGAGCACCTCGCGCCGG7CGCACAGCCCCGCGTCGCGGGGGTC 

CCGCGCCTGTGGGTCGCCGCCCGCCTCGGGCGCAGAACCCATCACCTGCCCGTCGATCAC 

4801 ---------+---------+---------+---------+---------+---------+ 4860

GGCGCGGACACCCAGCGGCGGGCGCAGCCCCCGTCT.7GGGTAGTGGACGG.GCAGCTAGTG" 

CTCATCCGCATGCCTGCACCAGGCATAGAGCATGACCGTATCCTCGCGGATGCCGCGOGG 

4861 + + * * -I- h 4920 

CAGTAGGCGTACGGACGTGGTCCGTATCTCGTACTGGCATACkjACkrGGCTACGGCCCGCC 

CATCAGCTTGGCCGCCTGCGCGAAGCTTTCCCAACCCTGCCCGATGGCCGCTTCGGAAGT- 

4921 ---------+---------+---------+---------+---------+---------+ 4980

CTAGTCGAACCGCCGGACGCGCTTCCAAACGCTTGGGACGCCCTACCGGCGAACCCTTCA 

CGCCGTCAGATCGGTCATCCGACGGCCAGGTGCCACAGCATGACCTGCGCCGTGGCCTTG 

4981 T ♦ + ^ * ^ 5040 

CCCGCAGTCTAGCCAG7ACCCTGCCCC7CCAGCC7G7CGTACTGGACGCCGCACCCGAAC 
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GCGCTGCCAACCACACCCCGGATGCCCGCACCCGGA7GCG7GCCCGCCCCCACCATGTAG 
5041 ---------+---------+---------+---------+---------+---------+ 5100

CGCGACGGT7GCTCTGGGCCCTACGCGCGTCGGCCTACGCACGGGCGGGGGTGCTACATC 

XAGTTCGGGATCGCGCGGTCGCGGTTATGCGGGCGGAACCAGGCCGATTGCGTCAGGATC 
5101 ---------+---------+---------+---------+---------+---------+ 5160

TTCAAGCCCTAGCGCGCCAGCGCCAATACGCCCGCCTTGGTCCGCCTAACGCAGTCCTAG 
^ GGCTCGACGGAGAAGGCGCTGCCGTGATGGGCCGACAGTTCGGTGCTGAAATCGGCGGGG 
CCGAGCTCGCTCTTCCGCGACGGCACTACCCGGCTGTCAAGCCACGACTTTAGCCGCCCC 

5221 

GACTTCTACGCCGACTGCO^GTCCACGAACGCGTCCAGCCCCTACG 5280 



TCCTCGAAGATGCGCTCGGCATAGCCCGGGGCCTCGGCTTCCCAATCGACATCGGCCCCG 
5281 ---------+---------+---------+---------+---------+---------+ 5340

AGG3VGCTTCTACGCGAGCCGTATCGGGCCCCGGAGCCGAAGGGTTAGCTGTAGCCGCGCC 

CCCAGATGCGGAACGGGCGCAAGGACGTAATGCGTGGACATCCCCfcGGG^ 
5341 ---------+---------+---------+---------+---------+---------+ 5400

gcotctacgccttccccccc^ 

ggatcggtcacgcagggcgaatgcagatacXtcgagaaatcgtccggcagc^ 

5401 ---------+---------+---------+---------+---------+---------+ 5460

CCrAGCCACTGCGTCCCGCTTACGTCTATCT 

TTGAAGATCTCGTTCACCACCCCCTtGTAGCGCCGGCCC^ 

AACrTCTACAGCAACTGGTCOGGGAACATCGCGCCCGGCTTCTACT 

AGGTtCTCGCGGCCCTTCGACACGCCCAAJKTGCAGCAC^ 
5521 ---------+---------+---------+---------+---------+---------+ 5580

TCCAAGAGCCCCCCGAACCTGTCCGGCTTTACGTCGTCCTTGTCGCTGTAGCTGCTCGCG 

TCCCGGtrCAOTATCCCGGCCTtGGTGCGCCCG^ 

5581 + + H K K 5640 

ACGCCCAACTCCTAGCGCCGGAACCACGCGCk;CGCCCk:CCAfACCGGGTCGTCCAG^ - 

TAGCTGTGCATCACGTCGCCCTTCCTGGCCACC(3tATCCGCCCCCAACf GCCG^ 
5641 ---------+---------+---------+---------+---------+---------+ 5700

ATCGACACGTAGTGCAGCGGCAACCACCCGTGGCATAGGCGCGCGTTGACGGCGGGCAGG - -'^ * 

AGCAGCGTGACGCCCGTGGCGCPATCGCCCTCGGTGTCGATGCGCGTGACGCGGGCATTC * 
5701 ---------+---------+---------+---------+---------+---------+ 5760

TCGT.CGCACTGCGGGCACCGCGCTAGCGGGAGCCACAGCTAGGCGCACTGCGCCCGTAAG ' 



AGCAGCAGCGTGCCGGCAAGACGCTCGAACAGGCCGACCATGCCCGCGACCAGCTGGTTG . 
5761 ---------+---------+---------+---------+---------+---------+ 5820

TCGTCCTCGCACCGCGCTTCTGCGACC7TG7CCCGCTGGTACGCGCGCTGGTCGACCAAC . , 

G7GCpGCCC77GGCGAACCAGACGCCGCCGCGCCG77CCAGCGCA7GCA7CAGCGCATAG 
5821 ---------+---------+---------+---------+---------+---------+ 5880

CACGCCGGGAACCGC77GG7CTGCGGCGGCGCGGCAAGGTCGCG7A^ 

A7CCAGC7GG7CGAAAACGGG77CCCGCCGACCAGCAGGG7G7GCAACGAGAAGGCCTGC 
5881 1— ^ ^^jr..- . — +''5940 

7AGCTCGACCACC7T77GCC.CAAGGGCGCC7CG7CG7CGCACACC77GC7CT7CCGGACG 



CGCAGA7G"CGGC7CC7GGA7GAACCCCGCCACCA7GC7G7GGACCCAGCGG7A'rGCC7GC 
5941 ---------+---------+---------+---------+---------+---------+ 6000

GCG7C7ACGCCCAGGACC7ACr:TCGCGCCG7GC7ACCACACC7GCC7CGCCA7ACQGACG 

ACCCCCA7CACCGCCCGCGCGGCCtTCACC ATCTGGCCCAGC77CJ\GGAAGGGCG7CC79 . . 

6001 ---------+---------+---------+---------+---------+---------+ 6060

7CCGCC7ACTCGCGCCCGv.-CCG.*:AAG7CG7ACACCGCG7CGAAC7CC7TCCCGCACCAC 
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CCCAGCTTCAG?.TACCCCTCCCoATAGACCTCCTCGGCGTAArCCSTGGAAGCGGCGATAC 

6061 r!- + -r 6120 

GGGTCGAACTCTArGCC-3AGCCCT;i.TC7GCAGGACCCGCATTAGCACCrTCGCCGCTATC 

CCATCGACATCGGCGGGATTCAAGCAGCCGACCTGGCGGATCAGCTCGTCGTCGTCGTTC 

GGTAGCTGTAGCCGCCCTAACTTCCTCCGCTGGACCGCCTAGTCGAGCAGCAGCAGCAAG 

ACGTATTCGAAGCTCCGGCCGTCCGCCCATGTCAGCCGCTACAAGGGCGAGACCGGCAGC 

TGCATAAGCTTCGACCCCGGCAGGCGGGTACAGTCGGCCATCTTCCCGCTCTGGCCGTCG 

•AGCGTCACGTCACGCTGCATCGGTTGGCC3CTGAGGGCCCACAGCTCTCGCAGGCTG^ ' - " 

TCGCAGTGCAGTGCGAGGTAGCCAACCGGCGACTCCCGGGTGTCGAGAGCCrrC 

.GGGTCGGTCACGACCGTCGGGCCTGCATCGAACACGTGGCCCJGATCGT^ 

CCCAGCCAGTGCTGCCAGCCCGGACG7AGCTTC7 _ 

CXIGCGGCCCCCGGGCTTGTCGCGGGCCTCGACGATGCrrGGTCGCG^^^^ 
6361 ---------+---------+---------+---------+---------+---------+ 6420

CGCGCCCGCGGCCCGPJ\CAGCGCCCGGAGCTGCTACCACCA 
AGGCGGAfGGCAAGC 

6421 ---------+---------+---------+---------+---------+---------+ 6480

TCCCCCTACCGTTCGCGTTCGGGCGGCTTTGGACGCGGCTACTGCTACCGCCTTGACTAC 



CTCTCTqCTGCAGCACCGGGCGTTCGGGCAGGCAGCCCACpSCCTGCGACAGCG . 

6481 — u — — — - — IV — 6540 

GAGAG^GGACCTCGTCCCCCGCjy^GCCCGTCCGTCGC 
(KrGGGCGTCCCGTG^^ 

6541 ---------+---------+---------+---------+---------+---------+ 6600

CCCCCGCAGGCCACTCCTACGCTTCCGCCACCCGGTTACAGT.CCGCGGGCCGTATCTTCG - 



GCTCGAXCAGCGGCTGCGCCAGGCCGTAGAJSCCGCTGCAG 

6601 ---------+---------+---------+---------+---------+---------+ 6660

CGAGGTAGTGCCCGACGCCGTGCGCGATCTTGGCCACGTCGTCCGCTATCGCTGCCACCC 



GCGCGCACCCCCGGAACAGCATCCGGTTCAGCACCCGCAGCAAGCGGTCCCGATCCGCGC 

6661 i-* + * + !. 6720 

CCCCCGTCCGCCCCTTGTCGTAGGCCAACTCGTCGGCG7CCTTCGCCAGCCCTAGGCGCG ' 



CATCGATCGCCCAGCCCCCCACCGCGCCACGCGCGGACGCGGTCGTCAGGTCGCGCGsICG - 

6721 ---------+---------+---------+---------+---------+---------+ 6780

CTAGCTACCGGGTCGGCGCGTOGCCCGCTGCCCCCCTGCCCCACCAGTCCACCCCGCGGC 

CGAtCGCATCCGCGACCTGCGCGGCATAGGGCAGCGAATATCCGGTGACGGGGTC^AACA^ 

6781 * ^. + -i- 6840 

CCTACCGTAGCCCCTCGACGCGCCGTATCCCGrCCCTTATAGGCCACr<^CC ' 

GCCCtCGCCCCAGCCCAACCGGCACCGCCCCCTGCGCCTGGTCGCGC^ 

6841 ---------+---------+---------+---------+---------+---------+ 6900

CGGGACGGGGGTCGGGTTGGCCGTCGCCGGGGACGCGCACCAGCGCCGTCTTCGGATACC 

CGTCATGGGCCACCCCCATGCGCACCATCCCCCTtTCGCCCCGCATCTCCTGCCCGGTCC' 

6901 ^ 6960 

GqAGTACCCGGTCGCGCTACCCCTCCTACGGGGAAACCGCCGCGTAGAGCACGGGCCACki 

AGCCCCGCCTGGCGGCATArGTCCAGCGACCCCTGCGCCAGCGCGCCATCGTCCAGATCGC 

6961 — r-T . r-^ ---"^ 7020 

TCGGGGCGGACCGCCCTATCACGTCGCTGCGCACGCGGTCGCGCGGTAGCAGGrCTAGCG 
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CGCCGTCGCTGTACCGCGTATCCTCGA7CACGATGC3CGTGGGACTGAAGCGCAGCAGAT 
7021 ^ ~ ^ ^ + 7080 

GCGGCAGCGACATCGCGCATAGGAGCTAGTCCTACGCCCACCCTGACTTCCCGTCGTCTA 

AGATGAAGCGGTACCCGTCCATCTGCGGA.:^CGGTCGCGTCCATGATCATCGGGCGCTCGA 
7081 ---------+---------+---------+---------+---------+---------+ 7140

TCTACTTCGCCATGGGCAGGtAGACGCCTTCCCAGCGCAGGTACTAGTAGCCCGCGAGCT 

CGCCATGGGGGGCGTCGGTCTCGATCTCGACGCCCACGAATTTCTGGAAACCCACGGTCA 

7141 ---------+---------+---------+---------+---------+---------+ 7200

CCGGTACC CC CG GC AGCC AG AGC^^ 

GGTGCGC^TCTCGACCGCACCACGGGCGTCGAfc^ 

CCACGCCCCAGAGCTCCCGTGGTGCCCGCAGCTAGTGCGTCCCrrCGaAGCTAG^ . f 

CGTCCGTCApvGTCGCGCCGGTATCGTCCAGCGTCGCGACATGCGTATTCC^ 
7261 -^--^^ ^ r 7320 

GCAGGCAGTCGCAGCGCGGCCATAGCAGGTCGCAGCGCTGTACGCATAAGGT 



CGACAC'CCTGCAGCAGCCCGATCACCGCGCCCCCCTCGATCGAGGGATAGCCTGTCGT,^ > 

7321 ---------+---------+---------+---------+---------+---------+ 7380

GCTGTGGCACGTCCTCGGGCTACTCCCGCGGGCGGAGCTAGCTCCGTAT.CGGACAGCAGX - . 

GGCGGCGCGAATGGTCGGGAAACGCGACCTeCTGATCCGTCCATTCCCCCCCACGAATGG 

7381 ^ 4. — -r ^ + i. 7440 

CCGCCGCGCT-TACCACCCCTTTGCGCTGCACGACTAGGC^ 

GCGACAGGCGCGCCAGCCATTCGGGCGAAAGA7CCGTGTCGTGGCAGGACCAGGTGTGC7 
7441 ^ + ^ ^ J. ^ 7500 

CCCTGT.CCGCGCCGTeTCTAMACCCCCCTTTCTAGGCACAG 

GGTCCCAj3GGGCCCCACCCCCCCTCCAGCATCACGATGCCCCCATCCGGTCTGCGGTCGC 

7501 -i- + * ^ h r 7560 

CCAGGCTCCCCGCCCTGGCGCGCAGCTCGTAGTGCTACGCGCGTAGGCCAGACGCCAGCG 

GAACGCCAAGCGCGATCAGCGCACCGCACAGCCCCCCGCCCGCCATCAGCAGATCAtGC5C- ■ 

7561 + r- +- h ». 7620 

CXTGCCGTTGGCGGTAGTCGCGTGGCCTGTCGGCGCGCGGGCGCtACTCGTCtACTACCG 

TCATCTATTGCGATCCGCCCCTTCCCGGTCCTTCAGCAdC(k:GCCCGAGCGT«'CACC * 

7621 ---------+---------+---------+---------+---------+---------+ 7680

AGTACATAACGCTAGGCGGGGAAGCGCCAGGAAGTCG7CCCCCGGCCT 



TGCCTtGA'GGCfGTCGACCGAGGCGGCCdVGATGAAACCGAAGCTGACGCAGTTCT 
7681 ---------+---------+---------+---------+---------+---------+ 7740

acggaactccgacacctggctcccgcgggtctactttggcttcgactgcgtcaagAgcgc 

gccatggaccgcgtgatgcatcctgtgtccctggtagacgcgacgaagatagccgc'gctt 

7741 ---------+---------+---------+---------+---------+---------+ 7800

CGGTACCTGGCGCACTACGTACGACACACGGACCATCTGCGCtcCTTCtATCGGCGCGAA 

GGGGACATAGCGGA.ACCCCCACCCCCCATCCACC.AAGCCGTCATGCAGGAAATAGTAGAT 

7801 ---------+---------+---------+---------+---------+---------+ 7860

CCCCTGTATCGCCTTGCCGCTCGCCCGTACCTGGTTCGGCAG7ACG7CCTTTATCATCTA 



CAGCCCCTAGCACGTGACCCCCACCCCCACCCACCACGCCAGATCCGACCCCA7CGCGCC 

7861 — -i--- ^.-r--. „+, i. 7920 

GTCGGCCATCGTCCACTGGGCGTGGCGGTCGGTGG7CCGG7C7AGGCTGGGG7AGCGCGG 

GA7CGCGAACACCACGA7CGAGATTACCGCGAAGA7GACGCCATAGAGG7CGT7C77C7C 

7921 -r r-+-- -i- 7980 

CrACCGCT7G7CG7GC7AGCTC7AATGGCGC7TCTACTGCGG7A7C7CCAGCAAGAAGAG • 
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Fig. 24/9



GAGCCCCTGGTCCTCATCCTCGTCGTGGTGCGATTTATCCCACCCCCAGCCCAGGGGGCC 
7981 ---------+---------+---------+---------+---------+---------+ 8040

ctcgcgcaccagcactacoaccac--:accacgctaaatacggtcgsggtcggctcccccgg 
atgcatgatccaccgatcgacggagtaggccgtcagctccatcgcggcgacggtcaggat 

8041 * + + ^ 8100 

TACGTACTAGCTCGCTACCTSCCTCATCCGGCAGTCGACGTAGCGCCGCTGCCAGTCCTA 

GACGGTCACGATTGCCGCCC.^AGTCCTCATGCCGGCCCCTTGCTTGATATGACAGGGAAC 
8101 ---------+---------+---------+---------+---------+---------+ 8160

CTGCCACTCCTAACGCCCGGTTCACGAGTACGGCCGGGGAACGAACTATACTGTCCCTTG 

AGGCTACSCTGCCCCGCGCTGCATGACCAGCCCATCGGGGTGCGACOOOVGG^^ 

8161 — T-r** — '.'"T -r^* — — h 8220 

TCCCATCCGACGGCGCGCCACGTACTGGTCGGGTAGCCCCACGCTGGTTTCCCGTAOiGC 

TCACATCfcCGTTCACGGCTCATAGGCGCAtCATCCCTGACATTCGCCGCCGiu^CC^ 

8221 ■ — »- -■ «• — r-+- r— rr*— : — 8280 

ACTGTACACGCAACTCCCGAGTATCCGCCTACTAGGCACTGTAAGCGGCGGCttCCC^ 

AGGCGCATCACCCCTTCCGTCGCTGGAAATATTAATGTTTTCCCGAAGATGGTCGGGGCG 

8281 — -+ .-^ — -H^- y ^ 8340 

TCCGCGTAGtGCGCAAGGCAGCGACCTfTATAATTACAAAAGGGCTTCTACC^ ' ^ 

AGACGATTCGAACCTCCGACCTACGGTACCCAAAACCGTCGCGCTACCAGGCTGCGCXAC ' * 
8341 + -r K , K 8400 

TCTCCTAAGCTTCGAGGCTGGATGCCATGGGTTTTGGCAGCGCGATGGTCCGACGCGATG 

GCCCCGACTGCGGAAGCCTTTAGCCGATTGTTCCGCCAAGGGAAAGACCTAGTC 

8401 * + + + , -M. 84 60 

CCGGGCTGACGCCTTCCGAAATCGGCTAACAAGGCCGTTCCCTTTCTCGATCACCGTCCG ' 



CAGGACCCCATTGTCCCCC^TCCCCGCATGCCCCATCGGCTGACCGGGCtTCAGGCCAAC • 
8461 ---------+---------+---------+---------+---------+---------+ 8520

GTCCTGGCGTAACAGCGGGTACGGGCCTACGCGGTAGCCGACTGGCCCGAACTCCQSttC* • 

GCGATCCGCCTCTCCGCCCGCGATTTCGAGGACGAACAGCCGCTCGGGGTCCGGATCGCC • 

8521 ---------+---------+---------+---------+---------+---------+ 8580

CCCTA<XX:CGAGACGCCGGCGCTAAAGCTCCTGCTrGTCGGCCAGCCCCAGGCCT 



GACCGCCGCCCCCCGAATGGGCCTCTCGTCCAGCCGCCGCGCATTCCGCTGGATG^ 

8581 ---------+---------+---------+---------+---------+---------+ 8640

CTGGCCCCqCGGGCCTTACCCGCAGACCAGCTCGCCCGCGttTAACGCCACCTACAC^ : . 



GATGACCCCGGTTTCATCCGCAAACACCATGTCCAGCGGCATCAGTGTCrrGCGtt * • 

8641 ---------+---------+---------+---------+---------+---------+ 8700

CTACTCCGGCCAAAGTAGGCGTTTCTGGTACAGGTCGCCCTAGTCACACAACGCGXAGGT • 

GAAGGACACCGCCTGGGGCGATTCGTAGATgAACAGCATTCCGGTGCCCGCAGGCAGCTC 
8701 ---------+---------+---------+---------+---------+---------+ 8760

CTTCCTCTCGCCGACCCCCC7.\AGCATCTACTTGTCGTAAGGCCACCGGCGTCCGTCGAG 

CTTGCGGAACATCACGCCCTC."GCGCGCTCTTCGGGGCTGTCCGCGACCTCGACCCGAAA 
8761 + ^ f. ^ — ^ 8820 

GAACGCCTTGTACTCCGGGACGCGCCCGAGAACCCCCGACAGGCGCTGGAGCTGGGCTTT 

CCCCACCGTTTCCCCACCCXTATCGACGACAAGACTGCCGGGCGCGCATTCCACCCCCGC 
8821 — ^— ' — — ♦ ^ — : + — ^ :+ 8880 

CCCCTCCCAAACGCCTCGCCATAGCTGCTGTTCTCACGGCCCGCGCCTAAGGTGGCGGCG 

CGCGGCCCCGGCCATCAGCACCGCAACAAGCCCTGCCGCCTTACTCGGCCACATGGCCAA 
8881 ---------+---------+---------+---------+---------+---------+ 8940

GCGCCGCCGCCCGTAGTCCTCGCC7TCTTCGCCACGCCGCAATGAGCCGGTGTACCCGTT 



CATACGACTCCTCCCCCCCGACATCCCCCGCCC7CCACGAATTCGATATCAAGCTTATCC 
8941 — — - — ♦ — - — — - — — — — -.--^-^ 9000 

CTATCCTGACGAGCCGCCGCTCTACGGGGCCCCACGTCCTTAACCTATACTTCCAATAGC 
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Fig. 24/10

ATACCGTCCACCTCGACGCGGGGCCCSGTACCCACCTTTTGTTCCCTTTACTGAGGGTTA 

9001 ' — + -r + -f. 9060 

TATGGCACCTGGAGCTCCCCCCCGGGCCATGGGTCGAAAACAAGGGAAATCACTCCCAAT 

ATTCCGCGCTTGCCGTAATCATCG7CATAGCTCTTTCCTGTGTGAAATTCTTATCCGCTC 

9061 — ' — * * »- h 9120 

TAACGCGCGAACCGCATTAGTACCACTATCGACAAAGGACACACTTTAACAATAGGCGAG 



ACAATTCCACACAACATACGAGCCGGAAGCAT>AAGTG7AAAGCCTGGGGTGCCTAATGA 

9121 ---------+---------+---------+---------+---------+---------+ 9180

TGTTAAGGTGTGTTGTATGCTCGGCCTTCGTATTTCACATTTCGGACCCCACGGATTACT 



GTGAGCTAACTCACATTAATTGCGTTCXIGCTCACTGCCCGCTrr 

9181 ---------+---------+---------+---------+---------+---------+ 9240

CACTCGATTGAGTGTAATTAACGCAACGCGAGTGACCKSGCG^^ 

TCGtGCCAGCTGCArT?y3^TGAAtCiGGC.CA^ 

9241 — : -liri K - f- 9300 

AGCACGGTCGACCT?aTTACTTACCCGGTTCCGCCCCCCTCTCCK 

CGCrrCTTCCGCTTCCTC^^ 

9301 ---------+---------+---------+---------+---------+---------+ 9360

GCCAGAAGGCGAAGGAGCCAGTGACTCAGCCACGCGAGCCACCAAGCCGACGCCGC7CGC 



GTATGAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACG^ 

9361 ^ + — — h — 9420 

CATAGTCGAGTJSAGTTTCCGCCA^^ 

AAGAACATGTG^GCAAAAG^ 

9421 + -+ — — — ---1" : — ' 'i " — :+ 9480 

TTCrXGTACACTCGTTTTCGCGTCGrrTTTCCGGT^ 

GCGTTTTTCCATAGGCTCCGCCGCGCTGACGAGCATCAC^U^AAATCGAGC^ 

9481 ---------+---------+---------+---------+---------+---------+ 9540

CGCAAAAAGGTATCCGAGGCGGCGGCACTCCTCGTACTGTTTTTAGCTTCGAGT^ 



AGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGGTCCCTC 

9541 ^ i- + ^ — + 9600 

TCCACCGCTTTGGGCTGTCCTGATATlTCTATGGTCCGCAAACGCGGACCTTaSACCGAG, 



GTGCGCTCTCCTGTTCCGACCCTGCCGC7TACCCCATACCTGTCCGCCTTTGTCCCTTCG 

9601 ^ > »- 9660 

CACGCGAGAGGACAJ^CGCTCGGACGGCGAATCGCCTATGGACAGGCGGAAAGACGGXACC 



GGAAGCGTGGCGCTTTCTCATAGCTCACGCTCTAGGTATCTCAGTTCGGTCTACGTCGTT^' 

9661 ---------+---------+---------+---------+---------+---------+ 9720

CCTTCGCACCGCCAAAGAGTATCGAGTCCGAiCATCCATAGAGTCAAGCCACATCCACCAA 

CGCTCCAACCTCCCCrCTGTGCACCAACCCCCCCTfCACCCCGACCTC 

9721 ---------+---------+---------+---------+---------+---------+ 9780

GCGAGGCTCGACCCCACACACGiTGCTTGCGGGG 

CGTAACtAtCGTCTTCAGTCCAACCCGGTAAGACACGACTtATCCCCACTGGCAG 

. CCATTGAT AGCACAACTCACGTTG 

ACTGGT^XcXGGATTAdCAG^GCGAGCTATGTAGGCGCTCCTACAGAGTTCT 

9841 ---------+---------+---------+---------+---------+---------+ 9900

TCACCATTGTCCTA^rCGTCTCGCTCCATACArC 

TCGCCTAACTACCGCTACACTACAAGGACACTATTTCGTATCTCCCCTCTCCTGAACcdA 
9901 * r ^.--i-„--— 4. 9960 

. ACCGGATTGATCCCCATCTGATCTTCCTGTCATAAACCATAGACGCGACACGACtTCGGT 
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CTTACCTTCGCAAAAAGAGT7GGTACCTCTTGATCCGGCAAACAAACCACCCCTGC7ACC 

9961 * * * + ▼ 10020 

CAATGGAAGCCTTTTrCTCAACCATCGACAACTAGGCCGTTTGTTTGGTGGCGACCATCG 

GGTGCTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGAT 

10021 * + 10080 

CCACCAAAAAAACAAAC.GTTCCTCGTCTAATGCGCGTCT7TTTTTCCTAGAGTTCTTCTA 

CCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTfAAGGGATT 

10081 + + +- 10140 

GGAAACTAGAAAAGATGCCCCAGACTGCGAGTCACCTTCCTTTTGACTGCAATTCCCTAA 

TTGGTCATGAGATTATCAAAAAGGA7CTTCACCTAGATCCTTTTAAATTAAAAATGAAGT 

10141 '-"^ + — ~+ ' — ^r-r ' ' ' . 1 10200 

AACCAGTACTCTAATAGTTTTTCCTAGAAGTGGATCTAGGAAAATTTAATT^ ' 

TTTAAATCAATCTAAACTATAfATGAGTAAACTTGGTCTGACAGTtACCAATGCTTA^ 

10201 — — + + — 1— + ' — -r :: — »• — : — 10260 

AAATTTACtT AG ATTTCAT A TAT ACTCA TTTG AACC AG ACTGt 



AGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTfCATCCATAGTTGCCT^CTCCCC 

10261 ^-r r —1*—. —7+ 10320 

TCACfcCCTGGATAGAGTCGCTACACAGATAAAGCAA^ 

CTCGTGTAGATAACTACGATACGGGACGGCTTACCATCTGCCCCCAGTGCTGCAATGArA 

10321 . -•»• r :~~-t .-+~ — — — — ! 10380 

CAGCACATCTATtGATCXTATCCCCTCiCCGAATGGTAGACCGGGGTCACGACGTTACT 

CCCC&lGACCCACGCTCACCGGCTCCAGAtrrATCAGC>ATAAAc£vGCCAi^ 

10381 ---------+---------+---------+---------+---------+---------+ 10440

GGCGCrcrGGGTGCGACTGGC'cCAGGTCTAAATAGTCGTtATTTGGfCGCT 

GCCGAGCCCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAArrGrrCC 
CGGCTCGCCTCTTCACCAGGACGttGAAATAGG^ 

CG<k;PJVGCTAGAGtAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTCTTGCCATTC^ 

10501 ---------+---------+---------+---------+---------+---------+ 10560

GCCCTTCCATCTCAtTCATCAACCGGTCAATTATCAAACCCGTTGCAACAACCGTAACGA \ 

ACAG<k:ATCGTGGTCTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAA- 

10561 + — — — + I t 10620 

TGTCCGTAGCACCACACTGCGAGCACCAAACCATACCCAAGTAACTCGAGGCCAAGGGTT 



CCATCAAGGCGAGTTACATGATCCCCCATGTTCTGCAAAAAACCGGTTACCTCCTTCGGT 

10621 — — — + + -t- + -♦■ 10680 

GCTACTTCCGCTCAATCTACTAGGGGGTACAACACCTTTTTTCGCCAATCGAGCyvAGCCA . • 

CCTCCGATCCTTGTCAGAAGT.aj^GTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCA 

10681 + + '-^ + 10740 

GGAGGCTAGC;iJ^CACTCTTCATTC>ACCCGCGTCACAATAGTCACTACCAATACCGTCGT 

CTGCATAATTCTCT7ACTGTCATGCCATCCGT.\.=»GATCCTTTTCTGTGACTGGTGAGTAC . 

10741 ' ^ 10800 

GACGTATTAAGAGAATGACAGTACGGTAGGCATTCTACGAAAAGACACTCACCACTCATG^ 

TCAACCAACTCATTCTCAGAwATAGTGTATGCGGCGACCCAGTTGCTCffGCCCGGCGttt 

10801 — + -J—- — + '• ' 10860 

AGTTGGTTCAGTAACACTCTTATCACATACGCCGCTGGCTCAACGAGAACGGGCC^ 

ATACOGGATAATACCGCCCCACATAGCAGAACTTTW^AACTGCTCATCATTGa^ 

10861 — — - — - — f — — — -K 10920 

TATGCCCTATTATGCCCCGGJ.GTATCGTCTTG>JVATTTTCACGAGTA^ 

TCTTCGCGCCCAAAA.CTCTCA.AGGATCTTACCGCTGTtGAGATCCAGTTCGXTGTAACCC 

10921 * " * + — 10980 

AGAACCCCCGCT7TTGACACTTCCTAGAATCCCGACAACTCTAGG7CAACCTACATTGGG 
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Fig. 24/12

ACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGtTTC 

TGAGCACGTGGGTTCACTACAAGTCGTACAAAAtGAAACT.CCTCCC>^ 

AAAACACGAAGGCAAAATGCCGCAAAAAAGGGAATAAGCGCGACACGCAAATGTTCA^ - 
11041 ---------+---------+---------+---------+---------+---------+ 11100

rrttctccttccgttttacggcgttttttcccttattccccctctgcc'rtta 
ctcatactcttcctttttcXatattattgaaccattt^^^ • 

11101 ---------+---------+---------+---------+---------+---------+ 11160

GACTATGAGAACGAA-^AAGTTATX^T AACTTCGTAJ^TAGTCCCAATAJVC^ ' 

GGATACAtATTTCAATGTATTTACA;»A?.ATAAACAAATAGCGGTTCC^ 

CCTATGTATAAACTTACATAAATCTTTTTATTTGTTtATC^^ 

CGAAAACTGCCAC . ' . 

11221 r-^— 3^1233 : h 

GCTTTTCACCGTG \ - 
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Fig. 26 
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Fig. 29 
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Fig. 30/1 



ACTGTAGICTGCGCGGATCGCCGGTCCGGGGGACAAGATATGAGCGCAC^ 
1 * + + + + + + 60 



iU^GGCAGATCTGACCGCCACCACTTTGATCGTCTCGGGCGGCA 
61 + + + + + + 120 

ttccgtctagactggcggtggtcaaactagcagagcccgccgtagtagcggcgcac 
gccctgcatctgcatck:gctgtggtttctggacgcg^ 

121 -t- + +— + + + 180 

CGGGACGITICACGTACGCGACACCAAAGACCTGCGCCGCCGCGTAGGGTAGGACCGCCAG 

GCGAATITCCTGGGGCTGACCTGGCnH^ 

181 — -1 +-:._J^_J-_+_^-^. — . 240 

CGCTTJ^GGACCCCGACTGGACCGACAGCqvCSCCAGACAAGT 

ATGCATGGGTCGGTCGTGCCGGGGCGCCCGCCXrGCa^TCCGG^ 

241 + 300 

TACGTACCCAGCCJiGCACGGCCCCGCGGGCGCGCfXnrrAC 

CTGTGGCTGTATGCCGGATTTTCCTGGCGCAAGATGATOT 

301 -+7— ^ — +— ~— + 360 

GACACCGACATACGGCCTAAAAGGACCGCGTT^ 

CCX:CATGCCGGAACXGACGACGACCCAGATTTCGACCATC^ 
361 -4:—.--- .^-----,---4.1-____^-V;:-^ 4- ' 420 

gcggtacggccttcgctkkno^^ -.. 

gcccgcttcatcckx:acctattix:ggctggcgc^ 

421 — -~-r.'*"--'"*r-~""""~t — :~" — — r— r-+ 480 

cggocgaagta<k:cgtggataaagccgaccgcgc^ 

acggtcp^tgcgctgatgtig^ 

TCXXASATACXXXl^ 

TCGATCCTGGC<nxa3ATCCACCTC?rrCGTGT^^ 

541 — — — -.1— ul+ — — + — - — — + 600 

AGCTAGGACCGCkcXTTAGGTCG^ 

601 + + + + + 660 

GTGCTGOXyUVGGGCCTGGCGGltnTACGCGCCAGCAGCXCCT 

CTCCrGA CClWr i - ilAC iaiXj GCGGTTATCATCACGAACACCACCTGCACCCGACGGTG 

661 + + + + + + 720 

GACGACTOGACGAAAGTGAAACCGCCAATACTAGTGCiUV i XSGTGGACGT^^ 

CCTTOGTGGCGCCTGCCCAGCACCCGCACCAAGGGGGACACCGCATCACCA^ 

721 + + + + + 780 

GGAACCACCGCGGACGGGTCgrGGGCG lX 3C ' i ' i X:CCCCTGTGGCGTACl'U5TT AAAGGACT 

TCCTCGT CG CCACCGTGC'TGGTGATGGAGCTGACGGCC^^ 

781 -I- * + + + ♦ 840 

AGCAGCAGCGGTGGCACGACCACTACCrCGACTGCCGGATAAGGCAGGTGGCGACCTACT 
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Fig. 30/2 



841 



TGCACGGCCCClTCkXKriXXXSGCrc^^ 

— •■ — +=• — — ^ — + : — : — — : ^ 

ACGTGCCGGGGAACCCGAGGCGGACCGTCTTC^ 



900 



TGGAAAAGARCGJWCXTXjtACGGC • 
901 pgo* 

^XGCCGGACCAGAWVCGCCACTAGCGGT^C^ 



TGGCkrrGGAlCTGCkXIACeGGTCC^ ' 

961 + 4..- 1— 

ACCCGACCTAGACCCGTOkxrAG^ - " 



1020 



1021 



ACTAGATAAAGCAGGACGTACTGCCCGACCACGTAGTCGCGACCGGCAAfeGCC^ 



CTCckjUVGGGCTATGCCAGACGCCrCT 

1081 - — : — i.-+---------+---™---w-- ^^4— + 

GAGCGlTCCXGATikCiSCTXrE^ 



1140 



1141 



•+ 1200 

CCGCGCTOCTAACGO^GTCGAAGCCGAAGtAGAfACGCGGCC^ 



1201 



AGGACCTGAAGACGTCGGGCGTCCTGCGGGCCGAGGC 

— — "*•- — + 

TCCTGGACTlxrRKAGCCCGCACGACGCCCGGCICCGCGrCC^ 



12S0 



1261 - 1261 
G 






EP 0 872 554 A2 



Fig. 31



ATGAGCGCACATGCCCTGCCa^GGCAGATCTGACCGCCACOVGT^^ 
1 ^ + + + + + 60 



GGCATCATCGCCGCGTGGCTGGCCCnxXATGTXXZATGCGCTGTG Grri^ ^ 

61 + + + + + 120 

CCGTAGTAGCGGCGCACCGACCGGGACGTACACGTACGCGACACCAAAGACCTGCGCCGC 

GCGCATCCCATCCTGGCGGTCGCGAATTTCCTGGGGCTGACCTGGC^ 

121 4. +- — + + + 180 

CGCGTAGGGTAGGACCGCCAGCGCTTAAAGGACCCCGACTGGACCGACAGCCAGCCAGAC 

TTOITCATXX^CGCATGACGCGATGCATGGGTCGGTCGTG^ 

181 + f + + + + 240 

AAGTAGTAGCGCGTACIGCGCTACGTACCaiGCCAGCACGGCCCCGCGGGCGCGCGGTTA 

GCGGCGATGGGCCAGCTTGT C CTGTGGCTGTATGCCGGATr i TCCTG(X:G<^ 

241 + ♦ + + 300 

CGCCGCTACCCGGTCGAACAGGACACCGACATACCKjCCTAAAAGGACCGCGrrCT 

GTCAAGCACATGGCCOVTCATCGCCATGCCGGAACCGACGACGACCOV^ 
301 -* ■»- + + + + 360 

GGCGGCCCGGTCCGCTGGTACGCCCGCTTCATCGGCACCTATXTCGGCTGGCGCGAGGGG 

361 — :*---T r+— -rr + — — »- ^ 420 

CCGCCGGGCCAGGCGACCATCxiGGGCGAAGfAGCCGTGC^ 

CTGCTCXriGCCCGTO^TCGTGACGGTCrATGC 

421 + + + + -r + 480 

GAOGACGACGGGCAGTAGCACTCCCAGATACGCGACTACAACG^ . ' 

GTGGTCTTCriXSGCCGTTGCCGTCGATCCTGGCC^ . 

CACCAGAAGACCGGCAACGGCAGCTA<^ 

TGGCTCCCXCACCGCCCCGGCCACGACGCGTTCCCGGACXGCCACAATGCGCGG^ 

ACCGACGGCGTCGCGGGGCCGGTGCTGCGCAAGGGCCKXX: 

CGGATCAGCGACCCCGTGrCGCTGCTGACC lX.CrriXACl ' 1 - rG G^ 

601 + + + + 660 

GCCTAGTO GC TGGGGCACAGCGACGACraSACGAAAGTGAAACCGCCAATAGTAGT^^ 



661 ♦ + + + + 720 

GTCGTW^ACGTGGGCTGCCACGGAACCACCGCGGACGGGTCGTG^ 

ACCGCATGA 

721 729 

TGGCGTACr 
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Fig. 32 



1 MSAHALPKAD LTATSLIVSG GIIAAWLALH VHALWFLDAA AHPILAVANF

51 LGLTWLSVGL FIIAHDAMHG SVVPGRPRAN AAMGQLVLWL YAGFSWRKMI

101 VKHMAHHRHA GTDDDPDFDH GGPVRWYARF IGTYFGWREG LLLPVIVTVY

151 ALMLGDRWMY VVFWPLPSIL ASIQLFVFGI WLPHRPGHDA FPDRHNARSS

201 RISDPVSLLT CFHFGGYHHE HHLHPTVPWW RLPSTRTKGD TA*
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Fig. 33 



ATGACCAATTTCCnXaATCGTCGTCGCCACCGTGCTGGTGAlX^ 
1 + + + + + + 60 

TACTXKrnTAAAGGACTAGCAGCAGCGGTGGCACqACCACT 

GTCCACCOCTGGATCATGCACGGCCCCTTGGCXrrGGGGCTXXX:^ 

61 + + + + + + 120 

CAGGTGGC3GACCTACTACGTGCa;GGGAACCCGACCCCGACCGTCyiT^ 

GAACACGACCACGCGCTGGAAAAGAACGACCTGTACGGCCTGGTCTTIX^ 

121 + + + + 4- -»- 180 

CUTOitSC'IX X SIGCGCGAC CriTiCnXX rrGGACATGCCG^ 

ACG G ' IWiWriXA CGGTCGGCTGGATCTCGGC^ 

181 + + + + : + + 240 

TGCOVCGACAAGTCCa^CCCGACCrAGACCCGTGGCCAGGACAC^ 

ATGACCGTCTACGGGCTGATCTATTTCCrrCCTGCATGACGGGC^^ 

241 + + + + 300

TACTGGCi^lXXZCCGACTAGATAi^GCAGGACGTAClXS^ 

CCGTTGCGCTATATXXCnCGCAAGGGCTAT^ 

301 + + +- + + 360 

GGCAAGGCGATATAGGGAGCCTTCCCGATACGro 

CACCACGCGGTCGAGGGGCGCGACCAITO 

361 ^^—-^4: ^Jl-^-JL^^^ — — 1— :+* 420 

GTQGTGCGaaWXrrCCCCGCGCrGGTAACGCAGTCGAAGCCGAAGTAGATACGCGGC 



421 + + + — — + — — + 480 

CAGCl X J ITC GACriC G 'lCCTGGACrrCi'GCAGCCCGCACGACGCCC^^ 

CGCACG 

481 486 

GCGTGC 
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Fig. 34



1 MTNFLIVVAT VLVMELTAYS VHRWIMHGPL GWGWHKSHHE EHDHALEKND

51 LYGLVFAVIA TVLFTVGWIW APVLWWIALG MTVYGLIYFV LHDGLVHQRW

101 PFRYIPRKGY ARRLYQAHRL HHAVEGRDHC VSFGFIYAPP VDKLKQDLKT

151 SGVLRAEAQE RT 
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Fig. 35 



HindIII/ PstI/ XbaI/ BamHI/ XhoI
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Fig. 36 







Fig. 37 
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Fig. 38/1 

CTCCAGGTCTGACACGGCCAGAAGGCCGCCCCGCGGGcCGGGGGCCCCcGCATCGCGACC 

+ — + + + + 

GACGTCCAGACTGTGCCGGTCTTCCGGCGCGCCGCCCgGCCCCCGGCCgCGTAGCGCTGG 



GGTATCCTTGCCAAGCGCCGCCTGGTCGCCCACaACGTCCAGCAGGTCGTCATAGGACTG 
61 + + ^ + + + 

CCATAGGAACGGTTCGCGGCGGACCAGCGGGTGtTGCAGGTCGTCCAGCAGTATCCTGAC 

GAACACCCGGCCCAGCTGACGGCCAAAGTCGATCATCTGaGTCTGCTCCTCGGCGTCGAA 

121 + 1— — + + + + 180 

CTTGTGGGCCGGGTCGACTGCCGGTTTCAGCTACTAGACtCAGACGAGGAGCCGCAGCTT 

CTCCTTGATCACGGCCAGCATCTCCAGCCCGGCGATGAACAGCACGCCGGTCTTCAGGTC 

181 + + V— + 240 

GAGGAACTAGTGCCGGTCGTAGAGGTCGGGCCGCTACTTGTCGTGCGGCCAGAAGTCCAG 

CTGTTCCTGTTCGACCCCCGCGCCGTTCTTGGCCGCGTGCAGGTCCAGGTCCTCGCCGGC 

241 + + + + : + 300 

GACAAGGACAAGCTGGGGGCGCGGCAACAACCGGCGCACCTCCAGGTCCAGGACCCGCCG 

• y ^ • , 

GCACAGGCCCTGCGGCCCCAGGGACCGCGACAGGATCCgcaccagcCgcgcccgcaccgr 
301 ♦ -1- + + + + 360 

CGTGXCCGGGACGCCGGGGTCCCTGGCGCTGTCCTAGGcgtggtcgacgcgggcgcggca 

gcccgacgcgccgcgcgcaccggccagcagggccatcgcctcggcgatcagggcgacgcc 

361 — — + ♦ p — + 420 

cgggccgcgcggcgqgfcgcggccggrccgtcccggtagcggagccactagticccgccacgg 

gcct agcacg^cgcggctt c cgccat gcgccacatgggt cgcgggct ggccgcggcgcag 

421 ♦ ♦ 'r^y: + 480 

cggatcgcgccgcgccgaaagcggcacgcggcgtacccagcgcccgaccggcgccgcgtc 

cccggcatcg^ccacgcagggcaggtcgccgaagatcagcgatgcggcargcaccarctc 

481 4- : 4--. + + ^ ^ 540 

gggccgc agcaggc acgrccccgc ccagcagcct ct agt cgcc acgccgt acgtggt agag 

gaccgcgcaggcggcgtcgacgatcgtgccgcagaccccgcccgaggcttctgccgcaag 
541 ^ ^ ' + + ; 600 

ctggcgcgtccgccgcagctgccagcacagcgtctggggcgggctccgaagacggcgttc =. 

cagcarcagcacgccgcggaaacgcctgcccgacgacagcgcgccacggctcatggccgg 

601 ♦ ♦ +— + Y 660 

gtcgcagtegtiacggcgcccctgcgaacgggccgctgtcgcgcggraccgagcaccggcc 

gccgagcggecgcgacacggeaccgaacccccgggcgac.ctcctcaagtctggtctgcag 
661 — ———4— ———■»•— —*-™+—— — — — +— 720 

cggctcgccgacgctgcgccgcggcttagggacccgctagaggagctcagaccagacgtc 

aagggtggcgtggatcgggccgacgtctcgtctcatcagtgcccccgcgcrtgggtcctg 

721 + + + — + — — ^ + 780 

ttcccaccgcacctagcccaactgcagagcagagtagtcacggaagcg'cgaacccaagac 

accaggcgggaaggticaggccggggcggcaccccgtgacccg^catccaccgccaacagt 

781 ♦ + + — »- -f 840 

tggcccgcccctccagcccggccccgccgtggggcactgggcagtaggtggcagttgtca 

ccccacgtcggaaggctccacgcccgatcgcgagccttttcgacggcgacgcggggtcgc 
841 + ♦ + + + + 900 

ggggtacaacctcccgaagcgcgggctaacgctcggaaaagctgccgctgcgccccagcg 

gcggcaatctncccaacaaggrcagcggaccggcgcgccgatggccgcgcgcagccaggc 
901 i- ♦ ♦ + ^ ^ 960 

cgccgttaaanaggctgtcccagtcacctggccgcgcggccaccggcgcgcgccggtccg 

acccctggccggaaacacccgcgccgcaccaCgatcggccaggatcgtccggcgcgcggc 
961 + 4- + + ♦ + 1020 
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Fig. 38/2 



caggaaccqgcccccgcgggcgcggcgcagcactagccggtcccagcaggccgcgcgccg 

gcggcgcaggccggccgcgrcacccggartgccaagcacccaggccaccgcgcccgcgac 
1021 ™> ^ ..^ . 

cgccgcgt:ccagccggcgcagt:gggcci:aacagtCcgtgggt:ccggtagcgcaggcgcT;g* 



cccgcccgcgtcgtccatgccgacgaccaggccgttctccacgccgcggaccagttcgcg 

gagcaggcgcagcaggtacagctgccagrccggcaagaggtacagcgcccggccaagcgc 

caccggggcggtgttcgatcgaccaccaggcacccggcggccatcgcctcggacagggac 

gcggccccgccacaagccagcn agtggcccgc aggccaccggcagcggagcctgc ccctg 

caggaggcgacgaagggcncggtgaaacagacacgcgcgtgcgaggcctgcag 
1201 1253 

gtcctccaccgcctcccgagccaccctacccgcacgcgcacgccccggacgcc 
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1200 
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Fig. 39 

ATGAGACGAGACGTCAACCCGATCCACGCCACCCTTCTCCAGACCAGACTTGAGGAGATC 

1 ^ + ♦ ♦ ♦ + 60 

TACTCTGCTCTGCAGTTGGGCTAGGTGCGGTGGGAAGAGGTCTGGTCTGAACTCCTCTAG

GCCCAGGGATTCGGTGCCGTGTCGCAGCCGCTCGGCCCGGCCATGAGCCATGGCGCGCTG 

61 * + + + ♦ + 120 

CGGGTCCCTAAGCCACGGCACAGCGTCGGCGAGCCGGGCCGGTACTCGGTACCGCGCGAC 

TCGTCGGGCAAGCGTTTCCGCGGCATGCTGATGCTGCTTGCGGCAGAAGCCTCGGGCGGG 

121 + ♦ + + 180 

AGCAGCCCGTTCGCAAAGGCGCCGTACGACTACGACGAACGCCGTCTTCGGAGCCCGCCC 

GTCTGCGACACGATCGTCGACGCCGCCTGCGCGGTCGAGATGGTGCATGCCGCATCGCTG

181 * * ♦ + * 240 

CAGACGCTGTGCTAGCAGCTGCGGCGGACGCGCCAGCTCTACCACGTACGGCGTAGCGAC 

ATCTTCGACGACCTGCCCTGCATGGACGATGCCGGGCTGCGCCGCGGCCAGCCCGCGACC

241 + + + + + + 300

TAGAAGCTGCTGGACGGGACGTACCTGCTACGGCCCGACGCGGCGCCGGTCGGGCGCTGG

CATGTGGCGCATGGCGAAAGCCGCGCCGTGCTAGGCGGCATCGCCCTGATCACCGAGGCG 

301 +- T * + T+ 360 

GTACACCGCGTACCGCTTTCGGCGCGGCACGATCCGCCGTAGCGGGACTAGTGGCTCCGC 

ATGGCCCTGCTGGCCGGTGCGCGCGGCGCGTCGGGCACGGTGCGGGCGCAGCTGGTGCGG

361 + + + + + + 420

TACCGGGACGACCGGCCACGCGCGCCGCGCAGCCCGTGCCACGCCCGCGTCGACCACGCC 

ATCCTGTCGCGGTCCCTGGGGCCGCAGGGCCTGTGCGCCGGCCAGGACCTGGACCTGCAC 

421 + + + + + + 480

TAGGACAGCGCCAGGGACCCCGGCGTCCCGGACACGCGGCCGGTCCTGGACCTGGACGTG 
GCGGCCAAGAACGGCGCGGGGGTCGAACAGGAACAGGACCTGAAGACCGGCGTGCTGTTC

481 + + + + + + 540

CGCCGGTTCTTGCCGCGCCCCCAGCTTGTCCTTGTCCTGGACTTCTGGCCGCACGACAAG

ATCGCCGGGCTGGAGATGCTGGCCGTGATCAAGGAGTTCGACGCCGAGGAGCAGACTCAG 
541 — + — + + + 600 

TAGCGGCCCGACCTCTACGACCGGCACTAGTTCCTCAAGCTGCGGCTCCTCGTCTGAGTC

ATGATCGACTTTGGCCGTCAGCTGGGCCGGGTGTTCCAGTCCTATGACGACCTGCTGGAC

601 + ♦ * + * -». 660 

TACTAGCTGAAACCGGCAGTCGACCCGGCCCACAAGGTCAGGATACTGCTGGACGACCTG 

GTTGTGGGCGACCAGGCGGCGCTTGGCAAGGATACCGGTCGCGATGCGGCGGCCCCCGGC 

661 + + + + + 720

CAACACCCGCTGGTCCGCCGCGAACCGTTCCTATGGCCAGCGCTACGCCGCCGGGGGCCG 

CCGCGGCGCGCCCTTCTGGCCGTGTCAGACCTCCAGAACCTGTCCCCTCACTATGAGCCC 

721 + * * + ♦ + 780 

GGCGCCGCGCCGGAACACCGGCACAGTCTGGACGTCTTGCACAGGGCAGTGATACTCCGG 

AGCCGCGCCCAGCTGGACGCGATGCTGCGCAGCAAGCGCCTTCAGGCTCCGGAAATCGCG 

781 + ♦ + + ♦ 1" 840 

TCGGCGCGGGTCGACCTGCGCTACGACGCGTCGTTCGCGGAAGTCCGAGGCCTTTAGCGC

GCCCTCCTCCAACGCGTTCTGCCCTACGCCGCGCGCGCCTAG 

841 ♦ ♦ * 882 

CGGGACCACCTTCCCCAAGACCCGATGCGGCGCGCGCGCATC 
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Fig. 40

1 MRRDVNPIHA TLLQTRLEEI AQGFGAVSQP LGPAMSHGAL SSGKRFRGML

51 MLLAAEASGG VCDTIVDAAC AVEMVHAASL IFDDLPCMDD AGLRRGQPAT 

101 HVAHGESRAV LGGIALITEA MALLAGARGA SGTVRAQLVR ILSRSLGPQG

151 LCAGQDLDLH AAKNGAGVEQ EQDLKTGVLF IAGLEMLAVI KEFDAEEQTQ

201 MIDFGRQLGR VFQSYDDLLD VVGDQAALGK DTGRDAAAPG PRRGLLAVSD

251 LQNVSRHYEA SRAQLDAMLR SKRLQAPEIA ALLERVLPYA ARA*
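The sequence figures pair each upper strand with its complementary lower strand, and the protein figures give the corresponding one-letter translation. The following Python lines are a minimal sketch of how one row of such a listing could be cross-checked. It assumes the standard genetic code (NCBI translation table 1) and reading frame 1 starting at the ATG; the helper names `complement`, `translate` and `mismatches` are illustrative and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated with
# bases in the order T, C, A, G (first base varies slowest, third fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement (not reversed), as printed under each row."""
    return strand.upper().translate(_COMPLEMENT)


def translate(strand: str) -> str:
    """Translate in frame 1; unreadable codons (e.g. OCR damage) become 'X'."""
    strand = strand.upper()
    usable = len(strand) - len(strand) % 3
    return "".join(
        CODON_TABLE.get(strand[i:i + 3], "X") for i in range(0, usable, 3)
    )


def mismatches(top: str, bottom: str) -> list:
    """1-based positions where the printed lower strand is not the complement of the upper."""
    expected = complement(top)
    return [
        pos + 1
        for pos, (e, b) in enumerate(zip(expected, bottom.upper()))
        if e != b
    ]


# Positions 1-60 of the upper strand as printed in Fig. 39.
TOP_1_60 = "ATGAGACGAGACGTCAACCCGATCCACGCCACCCTTCTCCAGACCAGACTTGAGGAGATC"

if __name__ == "__main__":
    print(translate(TOP_1_60))   # MRRDVNPIHATLLQTRLEEI
    print(complement(TOP_1_60))
```

Translating positions 1-60 in this way gives MRRDVNPIHA TLLQTRLEEI, the first twenty residues of the listed amino acid sequence. `mismatches` can be applied to any printed row to flag positions where the two strands disagree, which in a scanned listing usually points to a misread character rather than a real sequence difference.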
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Fig. 41 
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Fig. 42 
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